Definition of the likelihood ratio statistic


The likelihood ratio statistic for testing $H_0:\theta\in\Theta_0$ versus $H_1:\theta\in\Theta_0^c$ is usually defined by $$ \lambda(\mathbf x) =\frac{\sup_{\theta\in\Theta_0}L(\theta\mid\mathbf x)}{\sup_{\theta\in\Theta} L(\theta\mid\mathbf x)}, $$ where $\Theta$ is the whole parameter space and $\Theta_0\subset\Theta$ (see, for example, Section 8.2.1 of Casella and Berger (2002)).
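As a minimal numerical sketch of this definition (assuming, purely for illustration, a unit-variance Gaussian sample and the point null $H_0:\theta=0$ versus $H_1:\theta\neq0$), the statistic can be computed directly; in this model $\lambda(\mathbf x)=e^{-n\bar x^2/2}$, which always lies in $(0,1]$:

```python
import numpy as np

def log_lik(mu, x):
    # Gaussian log-likelihood with known unit variance (constants kept)
    return -0.5 * np.sum((x - mu) ** 2) - 0.5 * len(x) * np.log(2 * np.pi)

rng = np.random.default_rng(0)
x = rng.normal(loc=0.3, size=50)          # hypothetical sample

# H0: theta = 0 (Theta_0 is a single point); Theta is all of R
sup_h0 = log_lik(0.0, x)                  # numerator: sup over Theta_0
sup_all = log_lik(x.mean(), x)            # denominator: MLE theta-hat = xbar maximizes L over Theta

lam = np.exp(sup_h0 - sup_all)
print(lam)  # lies in (0, 1]; equals exp(-n * xbar^2 / 2) in this model
```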

Why is the supremum in the denominator taken over all of $\Theta$ instead of over $\Theta_0^c$? Would it make sense to define the likelihood ratio statistic with the supremum in the denominator taken over $\Theta_0^c$? If the supremum in the denominator were taken over $\Theta_0^c$, then $\lambda(\mathbf x)$ would not necessarily satisfy $0\le\lambda(\mathbf x)\le1$, but this choice seems more intuitive, since we would be comparing the likelihood when $H_0$ is true with the likelihood when $H_1$ is true. Or are these two options ($\Theta$ and $\Theta_0^c$) actually equivalent?

Any help is much appreciated!


1 Answer


Suppose $\boldsymbol X=(X_1,\ldots,X_n)$ is a random vector whose distribution is parameterized by $\theta$, where $\theta\in \Theta\subseteq \mathbb R^p$. Let $L(\theta\mid \boldsymbol x)$ be the likelihood function given the sample $\boldsymbol x=(x_1,\ldots,x_n)$.

In general we consider the problem of testing the null $H_0:\theta\in \Theta_0$ against the alternative $H_1:\theta\in \Theta_1$, where $\Theta_0\subset \Theta$ and $\Theta_1\subseteq \Theta-\Theta_0$.

We prefer $H_0$ to $H_1$ ($H_1$ to $H_0$) if $$\sup_{\theta\in\Theta_0}L(\theta\mid \boldsymbol x)>(<)\sup_{\theta\in\Theta_1}L(\theta\mid \boldsymbol x)$$

When $H_0$ is true (false), the ratio $$r(\boldsymbol x)=\frac{\sup_{\theta\in\Theta_0}L(\theta\mid \boldsymbol x)}{\sup_{\theta\in\Theta_1}L(\theta\mid \boldsymbol x)}$$

is expected to be large (small). But $r$ is not bounded above.

So we modify $r(\boldsymbol x)$ by

$$\Lambda(\boldsymbol x)=\frac{\sup_{\theta\in\Theta_0}L(\theta\mid \boldsymbol x)}{\sup_{\theta\in \Theta_0 \cup \Theta_1}L(\theta\mid \boldsymbol x)}=\frac{\sup_{\theta\in\Theta_0}L(\theta\mid \boldsymbol x)}{\sup_{\theta\in \Theta}L(\theta\mid \boldsymbol x)}$$
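Since the denominator satisfies $\sup_{\theta\in\Theta_0\cup\Theta_1}L=\max\bigl(\sup_{\Theta_0}L,\ \sup_{\Theta_1}L\bigr)$, the two statistics are linked by $\Lambda=\min(1,r)$: they agree whenever $r\le 1$, and $\Lambda$ simply caps $r$ at $1$. A small numerical check of this identity (a sketch, assuming a unit-variance Gaussian model with the one-sided hypotheses $H_0:\mu\le 0$, $H_1:\mu>0$):

```python
import numpy as np

def log_lik(mu, x):
    # unit-variance Gaussian log-likelihood, constants dropped
    return -0.5 * np.sum((x - mu) ** 2)

rng = np.random.default_rng(1)
for true_mu in (-0.5, 0.5):              # one draw under H0, one under H1
    x = rng.normal(loc=true_mu, size=40)
    xbar = x.mean()
    # H0: mu <= 0, H1: mu > 0; the constrained maximizers are the
    # sample mean clipped to each region
    sup_h0 = log_lik(min(xbar, 0.0), x)
    sup_h1 = log_lik(max(xbar, 0.0), x)  # sup over the open set (0, inf)
    r = np.exp(sup_h0 - sup_h1)                       # unbounded above
    Lambda = np.exp(sup_h0 - max(sup_h0, sup_h1))     # always in (0, 1]
    assert np.isclose(Lambda, min(1.0, r))            # Lambda = min(1, r)
```

In particular, when $\bar x<0$ the ratio $r$ exceeds $1$ (and can grow without bound), while $\Lambda$ stays pinned at $1$, illustrating why the modified statistic is preferred.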

If $H_0$ is true (false), then as before, $\Lambda$ is expected to be large (small).

However, we now have $\Lambda\in (0,1]$, and we trivially accept (or rather, fail to reject) $H_0$ whenever $\Lambda=1$.

This justifies a left-tailed test based on $\Lambda$, and $\Lambda(\boldsymbol X)$ is called the likelihood ratio criterion.
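As a sketch of how such a left-tailed test could be calibrated in practice (a Monte Carlo illustration under an assumed unit-variance Gaussian model with $H_0:\mu=0$, not part of the answer above): simulate $\Lambda$ under $H_0$ and reject when $\Lambda$ falls below the $\alpha$-quantile of its null distribution.

```python
import numpy as np

rng = np.random.default_rng(2)
n, alpha = 30, 0.05

def Lambda(x):
    # Lambda = exp(-n * xbar^2 / 2) in this model (closed form)
    return np.exp(-len(x) * x.mean() ** 2 / 2)

# simulate Lambda under H0 (mu = 0) and take its alpha-quantile
# as the critical value c of the left-tailed test
sims = np.array([Lambda(rng.normal(size=n)) for _ in range(20000)])
c = np.quantile(sims, alpha)

# by construction, rejecting when Lambda < c has size approximately alpha
rate = np.mean(sims < c)
```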