Computing a chi-square p-value with a large number of degrees of freedom (df)


I'm using a chi-square test on a large set of data points, so the number of degrees of freedom is rather large. So far, none of the online tables I've been able to find go up to $4095$ degrees of freedom.

Is a chi-square test still applicable with a dataset this large? Is there some way to find the p-value for $d.f. = 4095$?


There are 2 best solutions below

1
On BEST ANSWER

Thanks to Bartek's answer I realized the test is still applicable, but I'd need to compute the critical values myself for the different significance levels. I'll just hard-code these for my usage.

Here are my results. If anyone notices an error in them, please point it out.

For $d.f.=4095$:

\begin{array}{|c|c|}
\hline \text{Upper-tail probability } p & \text{Critical value of } X^2 \\
\hline .99 & 3887.41 \\
\hline .95 & 3947.29 \\
\hline .90 & 3979.46 \\
\hline .75 & 4033.60 \\
\hline .50 & 4094.33 \\
\hline .25 & 4155.67 \\
\hline .10 & 4211.40 \\
\hline .05 & 4244.99 \\
\hline .025 & 4274.26 \\
\hline .01 & 4308.47 \\
\hline
\end{array}
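For anyone who wants to reproduce values like these without a table, here is a minimal sketch using only the Python standard library. It does not use whatever method produced the table above; it swaps in the Wilson-Hilferty cube-root approximation, which is known to work well for large degrees of freedom. The function name `chi2_critical_value` is just an illustrative choice.

```python
from statistics import NormalDist


def chi2_critical_value(upper_tail_prob: float, df: int) -> float:
    """Approximate x such that P(chi2_df > x) = upper_tail_prob.

    Uses the Wilson-Hilferty approximation (an assumption of this sketch,
    not an exact quantile): (chi2_df / df)^(1/3) is treated as normal with
    mean 1 - 2/(9 df) and variance 2/(9 df).
    """
    # Standard normal quantile with the same upper-tail probability.
    z = NormalDist().inv_cdf(1.0 - upper_tail_prob)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * c ** 0.5) ** 3


# Print approximate critical values for df = 4095 at the tabulated levels.
for p in (0.99, 0.95, 0.90, 0.75, 0.50, 0.25, 0.10, 0.05, 0.025, 0.01):
    print(f"{p:>6}: {chi2_critical_value(p, 4095):.2f}")
```

If SciPy happens to be installed, `scipy.stats.chi2.isf(p, df)` should give the exact quantile and can be used to cross-check the approximation.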

1
On

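If $X_1, X_2, \dots, X_n$ are independent $\mathcal{N}(0,1)$ random variables, then we have: $$X_1^2+\dots+X_n^2 \sim \chi^2_n$$ Each $X_i^2$ has mean $1$ and variance $2$, so by the CLT, as $n$ tends to infinity, we have convergence in distribution: $$\sqrt{n}\left(\frac{X_1^2+\dots+X_n^2}{n}-1\right) \rightarrow \mathcal{N}(0,2)$$ So for large values of $n$ the CDF of $\chi^2_n$ should satisfy: $$P(\chi_n^2 \le x) \approx \Phi\left(\frac{x-n}{\sqrt{2n}}\right)$$ where $\Phi$ is the standard normal CDF. The p-value for an observed statistic $x$ is then approximately $1-\Phi\left(\frac{x-n}{\sqrt{2n}}\right)$. For $n=4096$ this approximation is off by about $0.003$ at worst, near $x = n$ (the $\chi^2_n$ distribution is slightly skewed, so its median sits at roughly $n - 2/3$ rather than $n$). Away from the center the largest error occurs near $x = 4096 \pm 157$, where it is about $0.0013$.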
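The size of the approximation error can be checked numerically. The sketch below uses only the standard library: it gets a reference value for the chi-square CDF by integrating the density with Simpson's rule (an assumption of this sketch, reasonable only for large $n$, where the mass more than 12 standard deviations below the mean is negligible) and compares it with the normal approximation. The function names are illustrative.

```python
import math
from statistics import NormalDist


def chi2_pdf(x: float, n: int) -> float:
    """Chi-square density with n degrees of freedom, computed in log space."""
    k = n / 2.0
    log_pdf = (k - 1.0) * math.log(x) - x / 2.0 - k * math.log(2.0) - math.lgamma(k)
    return math.exp(log_pdf)


def chi2_cdf_numeric(x: float, n: int, steps: int = 20000) -> float:
    """Reference CDF via Simpson's rule; intended for large n only."""
    sigma = math.sqrt(2.0 * n)
    lo = max(n - 12.0 * sigma, 1e-12)  # mass below this point is ignored
    if x <= lo:
        return 0.0
    h = (x - lo) / steps  # steps must be even for Simpson's rule
    total = chi2_pdf(lo, n) + chi2_pdf(x, n)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * chi2_pdf(lo + i * h, n)
    return total * h / 3.0


def normal_approx_cdf(x: float, n: int) -> float:
    """CLT approximation: Phi((x - n) / sqrt(2n))."""
    return NormalDist(mu=n, sigma=math.sqrt(2.0 * n)).cdf(x)


def approximation_error(x: float, n: int) -> float:
    """Reference CDF minus normal approximation at x."""
    return chi2_cdf_numeric(x, n) - normal_approx_cdf(x, n)


n = 4096
for x in (n - 157.0, float(n), n + 157.0):
    print(f"x = {x:7.1f}: error = {approximation_error(x, n):+.5f}")
```

For a p-value rather than a CDF value, take one minus the result of either `chi2_cdf_numeric` or `normal_approx_cdf`.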