Proof of Pearson's chi squared test


I was reading the proof of this theorem at http://ocw.mit.edu/courses/mathematics/18-443-statistics-for-applications-fall-2003/lecture-notes/lec23.pdf

They show that $\frac{v_j-np_j}{\sqrt{np_j}} \stackrel{D}{\longrightarrow} N(0,1-p_j)$. I don't understand, however, why

$\sum_{j=1}^r \frac{(v_j-np_j)^2}{np_j} \stackrel{D}{\longrightarrow} \sum_{j=1}^r Z_j^2$

holds?

I know that if $X_n \stackrel{D}{\longrightarrow} X$, then for every continuous function $f$ we have $f(X_n) \stackrel{D}{\longrightarrow} f(X)$ (the continuous mapping theorem), so $\frac{(v_j-np_j)^2}{np_j} \stackrel{D}{\longrightarrow} Z_j^2$. But I also know that it is not true in general that $X_n \stackrel{D}{\longrightarrow} X$ and $Y_n \stackrel{D}{\longrightarrow} Y$ imply $X_n+Y_n \stackrel{D}{\longrightarrow} X+Y$.


BEST ANSWER

If $X_n$ and $Y_n$ are independent, then it is true: the characteristic function of $X_n + Y_n$ factors as $E[e^{iu(X_n + Y_n)}] = E[e^{iuX_n}]\,E[e^{iuY_n}]$.

Since $X_n$ and $Y_n$ converge in distribution to $X$ and $Y$, we have

$$\lim_{n \to \infty} E[e^{iu(X_n + Y_n)}] = E[e^{iuX}]E[e^{iuY}] =E[e^{iu(X + Y)}] $$

Since the characteristic function of $X_n + Y_n$ tends to the characteristic function of $X + Y$, Lévy's continuity theorem lets you conclude that $$X_n + Y_n \stackrel{D}{\longrightarrow} X+Y.$$
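This independence step can be illustrated numerically. Below is a minimal sketch (my own illustration, not from the answer): $X_n$ and $Y_n$ are independent standardized binomial counts, each converging to $\mathcal N(0,1)$ by the CLT, so their sum should look like $\mathcal N(0,2)$. The sample sizes, probabilities, and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 1000, 20000  # arbitrary illustrative sizes

# X_n and Y_n: independent standardized binomials, each -> N(0,1) by the CLT.
x = (rng.binomial(n, 0.5, trials) - n * 0.5) / np.sqrt(n * 0.5 * 0.5)
y = (rng.binomial(n, 0.3, trials) - n * 0.3) / np.sqrt(n * 0.3 * 0.7)

# Because X_n and Y_n are independent, X_n + Y_n -> N(0, 2):
# the empirical mean should be near 0 and the variance near 2.
s = x + y
print(s.mean(), s.var())
```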

ANSWER

In the linked notes, $Z_i$ is distributed $N(0,1-p_i)$ for all $i$, as stated there. So since $(v_i - np_i)/\sqrt{np_i}$ converges in distribution to $N(0,1-p_i)$ for each $i$, we have that $(v_i - np_i)^2/np_i$ converges in distribution to $Z_i^2$ for each $i$. That should clear up the first part.

It is also worth noting that the $\varepsilon$-argument below does show that $X_n \to X$ and $Y_n \to Y$ imply $X_n + Y_n \to X + Y$, but only for pointwise (almost-sure) convergence, not for convergence in distribution, which is the mode at issue here; for distributional limits one needs extra structure such as independence or joint convergence of $(X_n, Y_n)$.

Proof (for pointwise convergence): let $\varepsilon > 0$ be given and choose $N$ sufficiently large that $|X_n - X|, |Y_n - Y| < \varepsilon/2$ for all $n \ge N$. Then $|(X_n - X) + (Y_n - Y)| \le |X_n - X| + |Y_n - Y| < \varepsilon/2 + \varepsilon/2 = \varepsilon$ for all $n \ge N$. Since $\varepsilon > 0$ was arbitrary, $X_n + Y_n \to X + Y$.
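The implication genuinely fails for convergence in distribution without extra structure. A standard counterexample, sketched here numerically (my illustration, not from the answer): take $X_n = Z$ and $Y_n = -Z$ for a single $Z \sim \mathcal N(0,1)$; each sequence converges in distribution to $\mathcal N(0,1)$, yet the sum is identically $0$ rather than $\mathcal N(0,2)$.

```python
import numpy as np

rng = np.random.default_rng(2)
z = rng.standard_normal(10_000)

# X_n = Z and Y_n = -Z both have the N(0,1) distribution for every n,
# so each sequence converges in distribution to N(0,1) trivially.
x, y = z, -z

# Yet X_n + Y_n is identically zero, not N(0,2): marginal distributional
# limits do not determine the limit of the sum without joint information.
print((x + y).var())  # -> 0.0
```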

ANSWER

If we write $Z_i = \frac{O_i-np_i}{\sqrt{np_i}}$ as in the lecture notes, the idea is that the vector $Z = (Z_1, \dots, Z_n)$ satisfies $Z \stackrel{D}{\longrightarrow} \mathcal N(0,\Sigma)$, a multivariate normal distribution, where

$$\Sigma=\text{Cov}(Z)=\begin{bmatrix} 1-p_1 & -\sqrt{p_1 p_2} & \cdots \\ -\sqrt{p_1 p_2} & 1-p_2 & \cdots \\ \vdots & \vdots & \ddots \end{bmatrix}.$$

If we compute $\det(\Sigma-\lambda I)=(1-\lambda)^{n-1}(-\lambda)$, we get that $\Sigma$ has $n-1$ eigenvalues equal to 1 and one equal to 0. (The computation is made easy by the fact that $\Sigma=I-pp^T$ for the unit vector $p=(\sqrt{p_1},\sqrt{p_2},\dots)$, together with Sylvester's determinant theorem.)
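This eigenvalue structure is easy to check numerically. A small sketch (the cell probabilities are an arbitrary illustrative choice, not from the notes):

```python
import numpy as np

# Hypothetical cell probabilities for a 4-category multinomial (my choice).
p = np.array([0.1, 0.2, 0.3, 0.4])
n = len(p)

# Covariance of the standardized counts: Sigma = I - q q^T with q_i = sqrt(p_i),
# where q is a unit vector because the p_i sum to 1.
q = np.sqrt(p)
Sigma = np.eye(n) - np.outer(q, q)

# Sigma should have n-1 eigenvalues equal to 1 and a single eigenvalue 0.
eigvals = np.sort(np.linalg.eigvalsh(Sigma))
print(eigvals)  # -> approximately [0., 1., 1., 1.]
```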

This means the distribution is really $n-1$ dimensional embedded in $n$ dimensions, and there is a rotation matrix $A$ that makes

$$A\Sigma A^T=\begin{bmatrix} 0 & 0 & 0 & \cdots \\ 0 & 1 & 0 & \cdots \\ 0 & 0 & 1 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}.$$

Now let $X = AZ \sim \mathcal N(0,A\Sigma A^T)$. Then $X$ is a vector $(0, X_1, X_2, \dots)$ of i.i.d. $\mathcal N(0,1)$ Gaussians. The function $f(Z) = Z_1^2 + Z_2^2 + \dots$ is the squared norm $\|Z\|_2^2$, and hence it doesn't change when we rotate its argument. This means $f(Z) = f(AZ) = f(X) = 0^2 + X_1^2 + \dots + X_{n-1}^2$, which is chi-square distributed with $n-1$ degrees of freedom!
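As a sanity check on the whole argument, one can simulate the Pearson statistic directly and compare its moments with those of a chi-square distribution with $r-1$ degrees of freedom (mean $r-1$, variance $2(r-1)$). The probabilities, sample size, and seed below are arbitrary illustrative choices, not from the notes:

```python
import numpy as np

rng = np.random.default_rng(0)
p = np.array([0.1, 0.2, 0.3, 0.4])  # hypothetical cell probabilities (r = 4)
n = 2000        # observations per experiment
trials = 5000   # number of simulated statistics

# Draw multinomial counts v and form Pearson's statistic
# sum_j (v_j - n p_j)^2 / (n p_j) for each experiment.
v = rng.multinomial(n, p, size=trials)
stats = ((v - n * p) ** 2 / (n * p)).sum(axis=1)

# Chi-square with r-1 = 3 degrees of freedom has mean 3 and variance 6.
print(stats.mean(), stats.var())
```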

Pretty cool stuff. Thank you for pointing me towards this result :-)