Chi square test, double counting?

We're testing whether three species are affected differently by some poison, getting this table (type/dead/alive/tot):

| Type | Dead | Alive | Total |
|------|------|-------|-------|
| A    | 10   | 20    | 30    |
| B    | 16   | 14    | 30    |
| C    | 22   | 8     | 30    |

Under H0, we expect 16 to die and 14 to live in each group. Apparently the correct chi-square test statistic sums both the squared differences for the survivors and the squared differences for the dead plants. This sounds extremely weird to me. Can someone explain why it's not enough to look only at the differences in the survivors? Why is it necessary to include data that is 100% correlated?

There is 1 solution below

It helps to compare one row of the chi-square table with the one-sample test for a proportion. For example, take $n=10$ and $p_0 = 0.3$, and let $\hat{p}$ be the observed proportion. Then the test for a proportion gives you

$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n }} = \frac{n\hat{p} - np_0}{\sqrt{np_0(1-p_0)}} $$

and $$Z^2 = \frac{(n\hat{p} - np_0)^2}{np_0(1-p_0)}$$

You already see the beginnings of the chi-square test. Now, the row sum for the chi-square test is

$$\chi^2 = \frac{(n\hat{p} - np_0)^2}{np_0} + \frac{(n(1-\hat{p}) - n(1-p_0))^2}{n(1-p_0)} $$

After some algebra, you'll get $\chi^2 =Z^2$.
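
For completeness, one way to carry out that algebra is sketched here, using only the symbols already defined. The key observation is that $n(1-\hat{p}) - n(1-p_0) = -(n\hat{p} - np_0)$, so both numerators are equal after squaring:

$$\begin{aligned}
\chi^2 &= \frac{(n\hat{p} - np_0)^2}{np_0} + \frac{(n\hat{p} - np_0)^2}{n(1-p_0)} \\
&= (n\hat{p} - np_0)^2 \left[ \frac{1}{np_0} + \frac{1}{n(1-p_0)} \right] \\
&= (n\hat{p} - np_0)^2 \cdot \frac{(1-p_0) + p_0}{np_0(1-p_0)} \\
&= \frac{(n\hat{p} - np_0)^2}{np_0(1-p_0)} = Z^2 .
\end{aligned}$$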

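You do get two terms with the same numerator, but the denominators, $np_0$ and $n(1-p_0)$, are not the same in general. Both terms are needed to recover $Z^2$: the second one is not double counting, it supplies the $1-p_0$ factor in the variance of $\hat{p}$.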
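
As a rough numerical check on the table from the question, the Python sketch below compares each row's two-term chi-square contribution with $Z^2$. It assumes $p_0$ is the pooled death proportion $48/90$ (which is where the expected 16 deaths per group come from). The function names `row_chi_square` and `row_z_squared` are illustrative only, not taken from any library.

```python
from fractions import Fraction

# Observed counts from the question's table: species -> (dead, alive).
observed = {"A": (10, 20), "B": (16, 14), "C": (22, 8)}


def row_chi_square(dead, alive, p0):
    """Two-term chi-square contribution of one row (dead term + alive term)."""
    n = dead + alive
    exp_dead = n * p0
    exp_alive = n * (1 - p0)
    return (dead - exp_dead) ** 2 / exp_dead + (alive - exp_alive) ** 2 / exp_alive


def row_z_squared(dead, alive, p0):
    """Squared one-sample proportion statistic Z^2 for the same row."""
    n = dead + alive
    return (dead - n * p0) ** 2 / (n * p0 * (1 - p0))


# Assumption: p0 is the pooled proportion of deaths over all three species.
total_dead = sum(dead for dead, _ in observed.values())
total = sum(dead + alive for dead, alive in observed.values())
p0 = Fraction(total_dead, total)  # 48/90, kept exact to avoid rounding noise

chi_rows = {k: row_chi_square(d, a, p0) for k, (d, a) in observed.items()}
z2_rows = {k: row_z_squared(d, a, p0) for k, (d, a) in observed.items()}

# Full statistic (both columns) versus the sum over the dead column only.
chi_square_total = sum(chi_rows.values())
dead_only = sum((d - (d + a) * p0) ** 2 / ((d + a) * p0)
                for d, a in observed.values())

for species in observed:
    print(species, float(chi_rows[species]), float(z2_rows[species]))
print("full statistic:", float(chi_square_total))
print("dead column only:", float(dead_only))
```

With exact fractions the two quantities agree row by row, while the sum over the dead column alone comes out smaller than the full statistic, so dropping the survivor terms changes the test statistic rather than removing a redundancy.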