Distribution of sample variance: Chi-squared distribution problem

I'm preparing for an exam and I can't seem to figure out the reasoning behind the answer to this question.

Why do they use a chi-squared test? Can someone walk me through their explanation? Thanks.

[Image: the exam question and its worked solution]

There are 2 best solutions below


This requires focus on some fussy (but important) details, so let me try to explain it step by step:

$$\begin{aligned}
P(S_X^2 > 1.5\sigma^2) &= P\left(\frac{S_X^2}{\sigma^2} > 1.5\right) = P\left(Q =\frac{9S_X^2}{\sigma^2} > 9(1.5)\right)\\
&= P(Q > 13.5) = 1-P(Q\le 13.5) = 0.1413 > 0.1,
\end{aligned}$$

where $Q = 9S_X^2/\sigma^2 \sim Chisq(df=9)$ (the multiplier 9 is $n-1$, the degrees of freedom), and I have used R statistical software to get the exact probability, as shown below.

1 - pchisq(13.5, 9)  # in R 'pchisq' is the CDF of Chisq dist'n
## 0.1412558

I will leave it to you to look in your printed tables of the chi-squared distribution to come as near as necessary to the exact value. [In my table, along the row for df = 9 and in the column that cuts 0.1 from the upper tail of the distribution, I find 14.6837. That is, $P(Q > 14.6837) = 0.1.$ Since 13.5 < 14.6837, $P(Q > 13.5)$ must exceed 0.1. Also, notice that the subscript in the column headings of many distribution tables gives the area in the right tail (not the area to the left, as for a CDF). In my table the column mentioned above is headed $\chi_{0.100}^2.$]
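The table lookup described above can also be reproduced in R. This is only a sketch of the same comparison; the variable names `crit` and `p_upper` are mine, not part of the original problem.

```r
# Upper-tail 10% critical value of the chi-squared distribution with 9 df
# (the table entry quoted above as 14.6837)
crit <- qchisq(0.10, df = 9, lower.tail = FALSE)
crit

# Upper-tail probability beyond the cutoff 13.5 (same as 1 - pchisq(13.5, 9))
p_upper <- pchisq(13.5, df = 9, lower.tail = FALSE)
p_upper

# 13.5 lies to the left of the critical value, so its tail area exceeds 0.10
13.5 < crit
p_upper > 0.10
```

Using `lower.tail = FALSE` mirrors the way printed tables index columns by the right-tail area.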

Below is a plot of the $Chisq(9)$ PDF. The vertical red line is at 13.5; the area under the curve to the right of this line is 0.1413. The dotted vertical green line is at 14.6837.

[Figure: PDF of $Chisq(9)$ with a vertical red line at 13.5 and a dotted vertical green line at 14.6837]

Now, I hope you can figure out the similar argument for $S_Y^2.$

$\chi_n^2 = Z_1^2 + Z_2^2 + \cdots + Z_n^2$

where the $Z_i$ are i.i.d. standard normal random variables and $n$ is the "degrees of freedom". This is the definition.
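The definition can be checked numerically with a small simulation in R. This is a rough sketch: the seed, the number of replications, and the choice $n = 9$ are arbitrary assumptions made for illustration.

```r
set.seed(1)        # arbitrary seed, only for reproducibility
n    <- 9          # degrees of freedom (illustrative choice)
reps <- 100000     # number of simulated sums (arbitrary)

# Each row holds n independent standard normal draws Z_1, ..., Z_n
z <- matrix(rnorm(n * reps), nrow = reps)

# Z_1^2 + ... + Z_n^2 for every row
q <- rowSums(z^2)

mean(q)                                    # should be close to n
mean(q > 13.5)                             # simulated upper-tail probability
pchisq(13.5, df = n, lower.tail = FALSE)   # exact tail probability for comparison
```

The simulated tail frequency should agree with `pchisq` to about two decimal places at this number of replications.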

So how do we estimate the spread parameter $\sigma^2$? We calculate the sample variance, the sum of squared deviations from the sample mean divided by $n-1$:

$s^2 = \frac 1{n-1} \sum\limits_{i=1}^n (X_i-\bar X)^2$

$s^2$ is an estimate with uncertainty; it is itself a random variable. So what distribution describes our uncertainty in $s^2$?

If the $X_i$ are i.i.d. normal with mean $\mu$ and variance $\sigma^2$, then the deviations $X_i-\bar X$ are also normally distributed random variables.

$\sum\limits_{i=1}^n (X_i-\mu)^2 = \sigma^2 \chi_n^2$
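A short worked step may help here, assuming the $X_i$ are i.i.d. normal with mean $\mu$ and variance $\sigma^2$ (the post relies on this implicitly). Standardizing each observation produces exactly the $Z_i$ of the definition:

$$Z_i = \frac{X_i-\mu}{\sigma} \sim N(0,1) \quad\Longrightarrow\quad \sum_{i=1}^n (X_i-\mu)^2 = \sigma^2 \sum_{i=1}^n Z_i^2 = \sigma^2 \chi_n^2.$$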

And we lose a degree of freedom by replacing $\mu$ with its estimate $\bar X$, since the deviations $X_i-\bar X$ must sum to zero:

$\sum\limits_{i=1}^n (X_i-\bar X)^2 = \sigma^2 \chi_{n-1}^2$
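This last result can be checked by simulation in R. The sketch below assumes normal data with sample size $n = 10$ (so $n - 1 = 9$); the values of `mu` and `sigma`, the seed, and the number of replications are hypothetical choices, not taken from the original problem.

```r
set.seed(2)        # arbitrary seed, only for reproducibility
n     <- 10        # sample size, giving n - 1 = 9 degrees of freedom
mu    <- 5         # hypothetical population mean
sigma <- 2         # hypothetical population standard deviation
reps  <- 100000    # number of simulated samples (arbitrary)

# Each row is one sample of n normal observations
x <- matrix(rnorm(n * reps, mean = mu, sd = sigma), nrow = reps)

# Sample variance of each row; var() uses the 1/(n-1) divisor
s2 <- apply(x, 1, var)

# Scaled sum of squared deviations, which should behave like chi-squared(n - 1)
q <- (n - 1) * s2 / sigma^2

mean(q)                                        # should be close to n - 1
mean(s2 > 1.5 * sigma^2)                       # simulated P(S^2 > 1.5 sigma^2)
pchisq(13.5, df = n - 1, lower.tail = FALSE)   # exact value under chi-squared(9)
```

The simulated probability does not depend on the particular `mu` and `sigma` chosen, which is why the chi-squared calculation works without knowing them.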