Why can a $\chi^2$ test be used to check whether a random variable has any distribution over a finite space?


In particular, I see it used to test whether a random variable is Uniform even though the graph of the $\chi^2$ pdf doesn't ever seem to look much like a uniform distribution's pdf, and I don't understand why this is.

Best Answer

I'll try to give an intuitive explanation rather than focus on the rigorous details, because those aren't too hard to find if you're interested.

First, an instructive example: consider a random variable $X$ that is uniformly distributed on $[0, 1]$. If you generate 100 independent copies of this random variable, how many of them will be less than $0.2$?

You're probably familiar with the binomial distribution, so you know it's relevant here: the number of points that land in that range is binomially distributed with parameters $n = 100$, $p = 0.2$. If you run this experiment many times and plot a histogram of those counts, it will be roughly bell-shaped with a peak around $20$. Even though nothing about the normal distribution appeared in the problem, one showed up anyway (essentially by the magic of the Central Limit Theorem).
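To see this concretely, here is a small simulation sketch (the trial counts and seed are arbitrary choices, not from the question): repeat the 100-draw experiment many times and look at the average count, which should sit near $np = 20$.

```python
import random

random.seed(0)

def count_below(n=100, threshold=0.2):
    """Generate n Uniform(0, 1) draws and count how many fall below threshold."""
    return sum(1 for _ in range(n) if random.random() < threshold)

# Each repetition gives one Binomial(100, 0.2) count; a histogram of these
# counts is roughly bell-shaped with a peak near n * p = 20.
counts = [count_below() for _ in range(2000)]
mean = sum(counts) / len(counts)
print(round(mean, 1))  # close to 20
```

Plotting `counts` as a histogram makes the bell shape visible directly.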

The $\chi^2$-distribution shows up in the application you mentioned for extremely similar reasons. So similar, in fact, that I'll start with the above description as a basis. Suppose you have $n$ data points, each of which can appear in $k$ different categories with various probabilities.

  • Fix one of your categories. The number of the $n$ data points that land in that category is binomially distributed, and a binomial variable is very well approximated by a normal distribution.
  • A normal distribution can be shifted and scaled to have the standard normal distribution.
  • A squared standard normal distribution has a $\chi^2$-distribution.
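
The three steps above can be sketched numerically for a single category (the parameter values here are illustrative, not from the answer): take a binomial count, shift and scale it to be approximately standard normal, then square it. Averaged over many repetitions, the squared values should have mean close to $1$, the mean of a $\chi^2$ variable with one degree of freedom.

```python
import random

random.seed(1)

n, p = 100, 0.2
mu, sd = n * p, (n * p * (1 - p)) ** 0.5  # binomial mean and standard deviation

def binomial(n, p):
    """One Binomial(n, p) draw via n Bernoulli trials."""
    return sum(1 for _ in range(n) if random.random() < p)

# Shift, scale, and square: (count - mu) / sd is approximately standard
# normal, so its square is approximately chi^2 with 1 degree of freedom.
squared = [((binomial(n, p) - mu) / sd) ** 2 for _ in range(5000)]
avg = sum(squared) / len(squared)
print(round(avg, 2))  # near 1, the mean of a chi^2_1 variable
```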

The basic idea is to compute something like the above for each category and add the results; a sum of independent $\chi^2$ variables is again $\chi^2$-distributed, with a higher degrees-of-freedom parameter. Matters are complicated by the fact that the category counts are not independent (they must sum to $n$), and this is why the statistic has one fewer degree of freedom than the number of categories. Nevertheless, it can be shown that the statistic obtained by counting, shifting, scaling, squaring, and adding does have an approximate $\chi^2$ distribution.
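Putting it all together, here is a sketch of Pearson's statistic $\sum_i (O_i - E_i)^2 / E_i$ for data that really is uniform over $k$ categories (the choices $n = 200$, $k = 5$ are arbitrary). If the $\chi^2$ approximation with $k - 1$ degrees of freedom is right, the statistic's average over many repetitions should be near $k - 1 = 4$.

```python
import random

random.seed(2)

def chi2_statistic(n=200, k=5):
    """Pearson's sum of (observed - expected)^2 / expected for n draws
    over k equally likely categories."""
    observed = [0] * k
    for _ in range(n):
        observed[random.randrange(k)] += 1
    expected = n / k
    return sum((o - expected) ** 2 / expected for o in observed)

# Under the (true) uniform hypothesis the statistic is approximately
# chi^2 with k - 1 = 4 degrees of freedom, whose mean is 4.
stats = [chi2_statistic() for _ in range(3000)]
avg = sum(stats) / len(stats)
print(round(avg, 1))  # near k - 1 = 4
```

In practice one would compare a single observed statistic against the $\chi^2_{k-1}$ tail (e.g. via `scipy.stats.chisquare`) rather than averaging many repetitions; the averaging here just checks the distributional claim.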

EDIT: To make this answer a bit more helpful since you asked for a reference for a rigorous proof: look at the theorem beginning at the bottom of page 2 here.