I found this table:
Type/Income | very low | low | middle low | middle high | high | very high
Food | 38% | 34% | 30% | 28% | 26% | 19%
Non-food | 62% | 66% | 70% | 72% | 74% | 81%
I found it interesting that, no matter how much income one has, the share (as a percentage) spent on food and non-food products doesn't seem to change significantly. I would like to test this. I used Pearson's $\chi^2$ test and made a table of expected percentages (using the usual method of multiplying the row and column totals and dividing by the grand total, which is 600%):
Type/Income | very low | low | middle low | middle high | high | very high
Food | 29% | 29% | 29% | 29% | 29% | 29%
Non-food | 71% | 71% | 71% | 71% | 71% | 71%
The statistic $\chi^2=\sum_{\mbox{cells}} (O-E)^2/E$ gave me $$\chi^2 = 0.10739,$$ which I guess I have to read as $10.739\%$. The critical value at $\alpha=0.05$ with $(n-1)(m-1)=5$ degrees of freedom is $\chi_{0.05}^2 = 11.1$, so we fail to reject the null hypothesis and conclude that income has no effect on the percentage spent on food/non-food products.
Is all this reasoning correct? I am used to this kind of test for tables with counts of individuals in each category, but it seems reasonable to me to use it for percentages as well. Are the assumptions reasonable too?
If this is not the right approach, how could I test such a hypothesis?
Thank you very much for any help or information you may have!
Sorry, but your procedure is not correct. In fact, no correct chi-squared test for independence can be performed if only percentages are available.
The observed and expected quantities used to compute the $\chi^2$ statistic must be counts, not percentages or fractions. Let me illustrate with a simple example:
Suppose we have three columns (levels of one categorical variable, maybe education), two rows (levels of a second categorical variable, maybe party), and we have fractions of people surveyed who prefer Candidate A in the cells of a matrix. Specifically, consider the matrix `FRAC` below.
First, let's suppose each cell is based on 30 people. Then matrix `MAT1` shows the counts, and a chi-squared test (with P-value > .05) does not show significant evidence of association. [The null hypothesis that categorical variables Education and Party are independent (as to preference for Candidate A) cannot be rejected.]
By contrast, if each cell is based on 100 people, then we have matrix `MAT2` of counts, and highly significant evidence of association.
Notice that in a chi-squared test the degrees of freedom are based on the number of levels of the two categorical variables, not on the sample size. (Sample size does enter into the formula for the power of the chi-squared test.)
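The 30-versus-100 comparison can be sketched in Python with SciPy. The fractions in `frac` are made up for illustration and are not the original `FRAC`; the helper `chi2_from_fractions` is likewise a hypothetical name. The point is only that identical fractions give different test results once the per-cell sample size changes.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical fractions (an assumption, NOT the original FRAC matrix):
# 2 rows (e.g. party) by 3 columns (e.g. education).
frac = np.array([[0.4, 0.5, 0.7],
                 [0.6, 0.5, 0.3]])


def chi2_from_fractions(frac, n_per_cell):
    """Turn fractions into counts (n_per_cell people behind each cell)
    and run Pearson's chi-squared test on the resulting count table."""
    counts = np.rint(frac * n_per_cell).astype(int)
    stat, p_value, dof, _expected = chi2_contingency(counts)
    return stat, p_value, dof


# Same fractions, two different (assumed) sample sizes per cell.
stat_30, p_30, dof_30 = chi2_from_fractions(frac, 30)
stat_100, p_100, dof_100 = chi2_from_fractions(frac, 100)

print(f"n = 30 per cell:  chi2 = {stat_30:.3f}, df = {dof_30}, p = {p_30:.4f}")
print(f"n = 100 per cell: chi2 = {stat_100:.3f}, df = {dof_100}, p = {p_100:.6f}")
# n = 30 per cell:  chi2 = 5.625, df = 2, p is about 0.060
# n = 100 per cell: chi2 = 18.750, df = 2, p is about 0.000085
```

With these made-up fractions the degrees of freedom stay at 2 in both runs, while the statistic grows in proportion to the sample size, so only the second table gives significant evidence of association.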
This is one reason that a bar chart based on percentages is not suitable for publication unless count information is provided in a caption or on axes. Bar charts for `MAT1` and `MAT2` would be identical, except for information on a count axis.
If you are confident that your percentages are based on hundreds of people (rather than a few dozen) then there may be evidence of association, but there is no way to know for sure in a qualitative sense, and certainly no hope of getting a P-value for a chi-squared test.
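To see how strongly the conclusion depends on the missing counts, here is a rough worked calculation for the table in the question. It assumes, purely for illustration, that every income group contains the same number $n$ of respondents; the question gives no counts, so $n$ is hypothetical. Writing $f$ for the observed fraction in a cell and $p$ for the pooled row fraction ($175/600$ for food, $425/600$ for non-food), the observed and expected counts are $O = nf$ and $E = np$, so

$$\chi^2_{\text{counts}} = \sum_{\text{cells}} \frac{(nf-np)^2}{np} = n\sum_{\text{cells}} \frac{(f-p)^2}{p} \approx 0.105\,n.$$

The factor $0.105$ is essentially the number computed in the question from the fractions alone; it is not a $\chi^2$ statistic until it is multiplied by $n$. Under this equal-group-size assumption, with 5 degrees of freedom and a critical value of about $11.07$ at $\alpha = 0.05$, the same percentages would give $\chi^2 \approx 5.2$ (not significant) for $n = 50$ and $\chi^2 \approx 105$ (highly significant) for $n = 1000$, with the changeover a little above $n = 100$ per group.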