When doing a $\chi^2$-test to decide whether an RV follows a given distribution, we make $n$ observations of the RV and count the frequency with which the observed value falls into certain intervals, called classes (exactly what one does to draw a histogram). Then we calculate the expected frequency for each of these intervals. The observed frequencies are $O_k$ and the expected frequencies are $E_k$.
We calculate $$ \chi^2=\sum_k\frac{(O_k-E_k)^2}{E_k} $$ and compare it to the critical values given in a $\chi^2$ table.
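To make the procedure concrete, here is a minimal sketch of the computation, assuming a hypothetical example of $120$ die rolls grouped into $6$ classes (one per face); the observed counts are invented for illustration, and the critical value is the tabulated $\chi^2_{0.95}$ quantile with $k-1=5$ degrees of freedom:

```python
import numpy as np

# Hypothetical observed counts for 120 rolls of a die, 6 classes.
observed = np.array([18, 22, 16, 25, 21, 18])
# Expected counts under the null hypothesis of a fair die: 120/6 = 20 each.
expected = np.full(6, observed.sum() / 6)

# The statistic from the question: sum over classes of (O_k - E_k)^2 / E_k.
chi2_stat = np.sum((observed - expected) ** 2 / expected)

# Critical value for k - 1 = 5 degrees of freedom at the 5% level,
# taken from a chi-squared table.
critical = 11.070
reject = chi2_stat > critical
```

Here `chi2_stat` comes out to $2.7$, well below $11.070$, so this hypothetical sample would not lead to rejecting the fair-die hypothesis.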
I have the following questions:
- $\chi^2$ is a sum of squares of standard normal random variables. However, the random variable $\frac{O_k-E_k}{\sqrt{E_k}}$ is certainly not always normal. My book stipulates that the expected count in each class be at least $5$, so that by the central limit theorem it becomes approximately normal. But how do I know that $5$ is large enough? For many other types of hypothesis tests we need $n>30$ or so for the approximation to be accurate, and $5$ is much smaller than that. Why is this the case?
- Different ways of dividing the data into classes can lead to different test results. How do I know which division into classes is the best choice? Suppose we measure "how similar" two distributions are by the $L^2$ distance between their density functions. Even then, there is no obvious evidence for which division into classes gives the best test (i.e., minimises the probability of a type II error at a given significance level). How could I find the "uniformly best test" (if it exists)?
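One way to probe the first question empirically is a Monte Carlo check: simulate data under the null hypothesis with every expected count equal to exactly $5$, and see how often the test rejects at the nominal $5\%$ level. The sketch below assumes $8$ equally likely classes and $n=40$ observations (so $E_k=5$ for all $k$), and uses the tabulated critical value $14.067$ for $7$ degrees of freedom; the class counts, seed, and trial count are all choices made for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# 8 equally likely classes, n = 40 observations: every E_k is exactly 5,
# the smallest expected count the book's rule allows.
k, n, trials = 8, 40, 20_000
expected = np.full(k, n / k)

stats = np.empty(trials)
for i in range(trials):
    # Draw class counts under H0 and compute the chi-squared statistic.
    observed = rng.multinomial(n, np.full(k, 1 / k))
    stats[i] = np.sum((observed - expected) ** 2 / expected)

# Fraction of simulated statistics exceeding the chi^2 critical value
# with k - 1 = 7 degrees of freedom at the 5% level (14.067 from a table).
rejection_rate = np.mean(stats > 14.067)
```

If the chi-squared approximation were exact, `rejection_rate` would be $0.05$; how far it drifts from that gives a direct sense of how much accuracy the "at least $5$" rule actually buys in a given setting.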