Distribution of likelihood ratio in a test on the unknown variance of a normal sample


EDIT: I have followed up to this discussion with a second question: https://math.stackexchange.com/questions/635567/hypothesis-test-on-variance-of-normal-sample

I am preparing for a stat exam and I was trying to derive the distribution of the likelihood ratio statistic for the hypothesis test below.

Let $X_1, \dots, X_{n}$ be a random sample from a $N(\mu,\sigma^2)$ distribution, where $\mu$ is known and $\sigma^2$ is unknown. I want to test the hypothesis $H_0 : \sigma^2 = \sigma_{0}^{2}$ vs. $H_1 : \sigma^2 \neq \sigma_{0}^{2}$ (with, trivially, $\sigma^2 > 0$).

The joint pdf of the $n$ independent random variables (i.e., the likelihood function of the random sample) is:

$L=\prod_{i=1}^{n} \large(\frac{1}{\sqrt{2\pi\sigma^2}})\cdot e^{-\frac{(X_i - \mu)^2}{2\sigma^2}}= \large(\frac{1}{\sqrt{2\pi\sigma^2}})^{n}\cdot e^{-\frac{\sum_{i=1}^{n} (X_i - \mu)^2}{2\sigma^2}}$
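As a numeric sanity check, the log of this likelihood can be evaluated directly. A minimal sketch in numpy; the sample, `mu`, and `sigma2` values are purely illustrative:

```python
import numpy as np

def log_likelihood(x, mu, sigma2):
    """Log of the normal likelihood with known mean mu and variance sigma2."""
    n = len(x)
    return -0.5 * n * np.log(2 * np.pi * sigma2) - np.sum((x - mu) ** 2) / (2 * sigma2)

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=50)  # illustrative sample with sigma^2 = 4
print(log_likelihood(x, mu=1.0, sigma2=4.0))
```

The likelihood itself is just `np.exp` of this value; working on the log scale avoids underflow for large $n$.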

Under the null hypothesis, the maximum value taken by $L$ is: $\large(\frac{1}{\sqrt{2\pi\sigma_{0}^{2}}})^{n}\cdot e^{-\frac{\sum_{i=1}^{n} (X_i - \mu)^2}{2\sigma_{0}^{2}}}$

If we do not constrain $\sigma^{2}$ to equal $\sigma_{0}^{2}$, then $L$ is maximised at the maximum likelihood estimator of $\sigma^{2}$, i.e. $\hat{\sigma}^{2}=\frac{\sum_{i=1}^{n} (X_i - \mu)^2}{n}$
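Numerically, this estimator is just the mean squared deviation from the known mean. A quick sketch (sample values are illustrative):

```python
import numpy as np

x = np.array([1.2, -0.5, 0.3, 2.1, -1.4])  # illustrative sample
mu = 0.0                                   # known mean
sigma2_mle = np.mean((x - mu) ** 2)        # MLE of sigma^2 when mu is known
print(sigma2_mle)
```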

In this case, the maximized likelihood becomes: $\large(\frac{1}{\sqrt{2\pi\hat{\sigma}^{2}}})^{n}\cdot e^{-\frac{\sum_{i=1}^{n} (X_i - \mu)^2}{2\hat{\sigma}^{2}}}$

Setting these as numerator and denominator, respectively, I get the following likelihood ratio statistic:

$\Lambda = \LARGE\frac{\large(\frac{1}{\sqrt{2\pi\sigma_{0}^{2}}})^{n}\cdot e^{-\frac{\sum_{i=1}^{n} (X_i - \mu)^2}{2\sigma_{0}^{2}}}}{\large(\frac{1}{\sqrt{2\pi\hat{\sigma}^{2}}})^{n}\cdot e^{-\frac{\sum_{i=1}^{n} (X_i - \mu)^2}{2\hat{\sigma}^{2}}}} \\ = \large(\frac{\hat{\sigma}^{2}}{\sigma_{0}^{2}})^{n/2}\cdot e^{-\frac{\sum_{i=1}^{n} (X_i - \mu)^2}{2\sigma_{0}^{2}}}\cdot e^{\frac{\sum_{i=1}^{n} (X_i - \mu)^2}{2\hat{\sigma}^{2}}} \\ = \large(\frac{\hat{\sigma}^{2}}{\sigma_{0}^{2}})^{n/2}\cdot e^{-\frac{\sum_{i=1}^{n} (X_i - \mu)^2}{2\sigma_{0}^{2}}}\cdot e^{\sum_{i=1}^{n} (X_i - \mu)^2\cdot\frac{1}{2}\cdot\frac{n}{\sum_{i=1}^{n} (X_i - \mu)^2}} \\ = \large(\frac{\hat{\sigma}^{2}}{\sigma_{0}^{2}})^{n/2}\cdot e^{-\frac{\sum_{i=1}^{n} (X_i - \mu)^2}{2\sigma_{0}^{2}}}\cdot e^{n/2}$

Since the test rejects when $\Lambda \leq k$ for some constant $k$, we can write:

$\Lambda= \large(\frac{\hat{\sigma}^{2}}{\sigma_{0}^{2}})^{n/2}\cdot e^{-\frac{\sum_{i=1}^{n} (X_i - \mu)^2}{2\sigma_{0}^{2}}}\cdot e^{n/2} \leq k$

And hence: $\large(\frac{\hat{\sigma}^{2}}{\sigma_{0}^{2}})^{n/2}\cdot e^{-\frac{\sum_{i=1}^{n} (X_i - \mu)^2}{2\sigma_{0}^{2}}} \leq k'$, where $k' = k\,e^{-n/2}$ is still a constant.

When I get here, there are two things I do not understand:

  • I do not think that you can also bring $e^{-\frac{\sum_{i=1}^{n} (X_i - \mu)^2}{2\sigma_{0}^{2}}}$ to the right-hand side, as this is a function of the random sample. Am I missing something?
  • If the previous statement is correct, how do you find the distribution of the left hand side?

Any clarification, link or reference would be extremely helpful.

Accepted answer:

As you yourself write, the maximized likelihood given the sample is

$$L(\hat{\sigma}^{2} \mid \mathbf x) = \left(\frac{1}{\sqrt{2\pi\hat{\sigma}^{2}}}\right)^{n}\cdot e^{-\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{2\hat{\sigma}^{2}}}$$

and you have that

$$\hat{\sigma}^{2}=\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}$$

Inserting this into the likelihood we get

$$L(\hat{\sigma}^{2} \mid \mathbf x) = \left(\frac{1}{\sqrt{2\pi\hat{\sigma}^{2}}}\right)^{n}\cdot e^{-(n/2)} $$

Then the Likelihood Ratio is

$$\Lambda = \frac{ \left(\frac{1}{\sqrt{2\pi\sigma_0^2}}\right)^{n}\cdot e^{-\frac{\sum_{i=1}^n (X_i - \mu)^2}{2\sigma_0^2}}} {\left(\frac{1}{\sqrt{2\pi\hat{\sigma}^{2}}}\right)^{n}\cdot e^{-(n/2)}} = \left(\frac {\hat{\sigma}^2}{\sigma_0^2}\right)^{n/2} \cdot \exp\left \{-\frac 12\left(\frac{\sum_{i=1}^n (X_i - \mu)^2}{\sigma_0^2}-n\right)\right\}$$

and using again the expression for $\hat \sigma^2$ we get

$$\Lambda = \left(\frac {\hat{\sigma}^2}{\sigma_0^2}\right)^{n/2} \cdot \exp\left \{-\frac n2\left(\frac {\hat{\sigma}^2}{\sigma_0^2}-1\right)\right\}$$
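In this form $\Lambda$ depends on the data only through the ratio $r = \hat{\sigma}^2/\sigma_0^2$, which makes it easy to inspect numerically. A sketch (the function name is mine):

```python
import numpy as np

def likelihood_ratio(r, n):
    """Lambda as a function of r = sigma_hat^2 / sigma_0^2 and the sample size n."""
    return r ** (n / 2) * np.exp(-(n / 2) * (r - 1))

# Lambda peaks at 1 when r = 1 (the MLE agrees with sigma_0^2) and falls off
# in both directions, so a small Lambda flags r far from 1 on either side.
for r in (0.5, 1.0, 2.0):
    print(r, likelihood_ratio(r, n=20))
```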

Now, under the null, each standardized squared deviation is a $\chi^2(1)$ random variable:

$$z_i^2 = \left(\frac {x_i - \mu}{\sigma_0}\right)^2 \sim \chi^2(1)$$

We have

$$ \frac {\hat{\sigma}^2}{\sigma_0^2} = \frac 1n\sum_{i=1}^{n} \left(\frac {x_i - \mu}{\sigma_0}\right)^2 = \frac 1n \sum_{i=1}^{n}z_i^2$$

So we can write $$\Lambda = \left(\frac 1n \sum_{i=1}^{n}z_i^2 \right)^{n/2} \cdot \exp\left \{-\frac n2\left( \frac 1n \sum_{i=1}^{n}z_i^2-1\right)\right\}$$

Taking minus the log, we have

$$-\ln \Lambda = -\frac n2 \ln \left(\frac 1n \sum_{i=1}^{n}z_i^2 \right) +\frac n2\left( \frac 1n \sum_{i=1}^{n}z_i^2-1\right)$$

Manipulating the second term on the RHS,

$$= -\frac n2 \ln \left(\frac 1n \sum_{i=1}^{n}z_i^2 \right) + \sqrt {\frac n 2} \left( \frac {\sum_{i=1}^{n}(z_i^2-1)}{\sqrt {2n}}\right)$$ and multiplying throughout by $\sqrt {\frac 2n}$ we obtain

$$-\sqrt {\frac 2n} \ln \Lambda = -\sqrt {\frac n 2} \ln \left(\frac 1n \sum_{i=1}^{n}z_i^2 \right) + \left( \frac {\sum_{i=1}^{n}(z_i^2-1)}{\sqrt {2n}}\right)$$

The second term on the RHS is a standardized sum of i.i.d. $\chi^2(1)$ random variables, each with mean $1$ and variance $2$, so the standard deviation of the sum is $\sqrt{2n}$; by the CLT it converges in distribution to $N(0,1)$. Denote it

$$S_n = \frac{\sum_{i=1}^{n}(z_i^2-1)}{\sqrt{2n}} = \sqrt{\frac n2}\left(\frac 1n \sum_{i=1}^{n} z_i^2 - 1\right)$$

The first term, however, cannot be handled by passing to the limit factor by factor: $\sqrt{\frac n2}$ diverges while the logarithm tends to $\ln(1) = 0$, an indeterminate $\infty \cdot 0$ form. Instead, write $\bar z = \frac 1n \sum_{i=1}^{n} z_i^2$ and expand the logarithm to second order around $1$. Since $\bar z - 1 = O_p(n^{-1/2})$,

$$\ln \bar z = (\bar z - 1) - \frac 12 (\bar z - 1)^2 + o_p(n^{-1})$$

Substituting this into $-\ln \Lambda = -\frac n2 \ln \bar z + \frac n2 (\bar z - 1)$, the first-order terms cancel exactly, leaving

$$-\ln \Lambda = \frac n2\left(\bar z - 1 - \ln \bar z\right) = \frac n4 (\bar z - 1)^2 + o_p(1) = \frac 12 S_n^2 + o_p(1)$$

Since $S_n \rightarrow_d N(0,1)$, the continuous mapping theorem gives

$$-2\ln \Lambda \rightarrow_d \chi^2(1)$$

if the null hypothesis is true, exactly as Wilks' theorem predicts for a single restriction. (Note that the two $N(0,1)$ contributions cancel, so the scaled quantity $-\sqrt{\frac 2n} \ln \Lambda$ actually converges in probability to $0$, not to a normal limit.) For a finite sample the test statistic is calculated as

$$-2\ln\Lambda = n\left[\frac {\hat{\sigma}^2}{\sigma_0^2}-1-\ln \left(\frac {\hat{\sigma}^2}{\sigma_0^2}\right)\right]$$

and $H_0$ is rejected when it exceeds the upper-tail critical value of the $\chi^2(1)$ distribution.
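A Monte Carlo sketch can be used to check the asymptotics under $H_0$: the statistic $-2\ln\Lambda = n\left[\hat\sigma^2/\sigma_0^2 - 1 - \ln(\hat\sigma^2/\sigma_0^2)\right]$ should, by Wilks' theorem for a single restriction, be approximately $\chi^2(1)$, i.e. have mean $\approx 1$ and variance $\approx 2$. All parameter choices below (seed, $n$, replication count) are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(42)
n, mu, sigma0_sq, reps = 200, 0.0, 1.0, 20_000

# Simulate `reps` samples of size n under H0: sigma^2 = sigma0^2
x = rng.normal(mu, np.sqrt(sigma0_sq), size=(reps, n))
sigma_hat_sq = np.mean((x - mu) ** 2, axis=1)   # MLE of sigma^2 (mu known)
r = sigma_hat_sq / sigma0_sq
lr_stat = n * (r - 1 - np.log(r))               # -2 ln(Lambda), nonnegative

# chi^2(1) has mean 1 and variance 2
print(lr_stat.mean(), lr_stat.var())
```

The nonnegativity of `lr_stat` (since $r - 1 - \ln r \ge 0$) also makes clear why a symmetric normal limit cannot hold for this statistic without further centering.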