Ask the detail of Neyman-Pearson Lemma (Necessity Part)


This is Theorem 8.3.12 (Neyman–Pearson Lemma) in Casella and Berger's Statistical Inference. Consider testing $H_0: \theta=\theta_0$ versus $H_1: \theta=\theta_1$, where the pdf or pmf corresponding to $\theta_i$ is $f(\textbf{x}|\theta_i)$, $i=0,1$, using a test with rejection region $R$ that satisfies (8.3.1)

\begin{align} \begin{matrix} \textbf{x}\in R & \text{if}& f(\textbf{x}|\theta_1)>kf(\textbf{x}|\theta_0)\\ \textbf{x}\in R^c & \text{if} & f(\textbf{x}|\theta_1)<kf(\textbf{x}|\theta_0) \end{matrix}\tag{8.3.1}\label{8.3.1} \end{align} for some $k \geq 0$, and \begin{align} \alpha=P_{\theta_0}(\textbf{x}\in R)\tag{8.3.2}\label{8.3.2} \end{align}

Then (Sufficiency): Any test that satisfies (8.3.1) and (8.3.2) is a UMP level $\alpha$ test.

(Necessity): If there exists a test satisfying (8.3.1) and (8.3.2) with $k>0$, then every UMP level $\alpha$ test is a size $\alpha$ test (satisfies (8.3.2)), and every UMP level $\alpha$ test satisfies (8.3.1) except perhaps on a set $A$ satisfying $P_{\theta_0}(\textbf{x}\in A)=P_{\theta_1}(\textbf{x}\in A)=0.$

I have difficulty understanding this Necessity part. In the proof, it is clear that any test satisfying (8.3.2) is a size $\alpha$ test. So I don't know why the Necessity part says it again:

If there exists a test satisfying (8.3.1) and (8.3.2) with $k>0$, then every UMP level $\alpha$ test is a size $\alpha$ test (satisfies (8.3.2))

Also, the two statements are worded differently: the Necessity part requires the test to satisfy both (8.3.1) and (8.3.2). But shouldn't satisfying (8.3.2) alone be enough?

Best answer:

Before sketching any arguments, let me introduce some notation and remarks.

Let $T_*(x)$ denote the test described by (8.3.1), that is \begin{align} T_*(x)=\left\{\begin{matrix} 1& \text{if}& f(\textbf{x}|\theta_1)>kf(\textbf{x}|\theta_0)\\ \gamma &\text{if} & f(\textbf{x}|\theta_1)=kf(\textbf{x}|\theta_0)\\ 0 & \text{if} & f(\textbf{x}|\theta_1)<kf(\textbf{x}|\theta_0) \end{matrix} \right. \end{align} where $k$ and $\gamma$ are chosen so that $$E_{\theta_0}[T_*(X)]=P_{\theta_0}\big[f(X|\theta_1)>k f(X|\theta_0)\big]+\gamma P_{\theta_0}\big[f(X|\theta_1)=k f(X|\theta_0)\big]=\alpha $$ Notice that the function $t\mapsto P_{\theta_0}[f(X|\theta_1)>t f(X|\theta_0)]$ is nonnegative, monotone nonincreasing and right continuous, so such $k$ and $\gamma$ exist and are unique. If one is dealing with continuous distributions, then the term $\gamma P_{\theta_0}\big[f(X|\theta_1)=k f(X|\theta_0)\big]$ does not appear.
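To make the construction of $k$ and $\gamma$ concrete, here is a small numerical sketch with a hypothetical example of my own (not from the book): $X\sim\text{Binomial}(10,p)$, testing $H_0: p=0.5$ versus $H_1: p=0.8$ at $\alpha=0.05$. The likelihood ratio $f(x|p_1)/f(x|p_0)$ is increasing in $x$, so $T_*$ fills the rejection region with the most alternative-favoring sample points and randomizes on the first point that would overshoot $\alpha$.

```python
from math import comb

# Hypothetical example: X ~ Binomial(10, p), H0: p = 0.5 vs H1: p = 0.8.
n, p0, p1, alpha = 10, 0.5, 0.8, 0.05

def pmf(x, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Sample points ordered by likelihood ratio f(x|p1)/f(x|p0), largest first.
points = sorted(range(n + 1), key=lambda x: pmf(x, p1) / pmf(x, p0), reverse=True)

# Accumulate null mass until adding the next point would overshoot alpha;
# that point becomes the randomization boundary with weight gamma.
mass, reject = 0.0, []
for x in points:
    if mass + pmf(x, p0) <= alpha:
        reject.append(x)
        mass += pmf(x, p0)
    else:
        boundary = x
        gamma = (alpha - mass) / pmf(x, p0)
        break

k = pmf(boundary, p1) / pmf(boundary, p0)   # critical ratio in (8.3.1)
size = mass + gamma * pmf(boundary, p0)     # equals alpha by construction
print(reject, boundary, gamma, k, size)
```

In this example the test rejects outright on $\{9,10\}$ and randomizes at the boundary point $x=8$ with $\gamma\approx0.893$; the randomization term is exactly what makes the size equal to $\alpha$ despite the discreteness.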

To show necessity, that is, that any UMP test $T_u(x)$ at level $\alpha$ satisfies $T_u(x)=T_*(x)$ on $\{x:f(x|\theta_1)\neq k f(x|\theta_0)\}$, we should first understand what is so special about $T_*(X)$. Once this is established, necessity follows from some basic measure-theoretic arguments that I will only explain towards the end of my answer.


First we reproduce the arguments that show why $T_*(x)$ is UMP at level $\alpha$, that is, we show that for any other test $T(x)$ with $E_{\theta_0}[T(X)]\leq\alpha$, we have $E_{\theta_1}[T_*(X)]\geq E_{\theta_1}[T(X)]$.

Here is more or less how the argument works.

Let $T(x)$ be any other test with size at most $\alpha$, that is, $E_{\theta_0}[T(X)]\leq\alpha$. Recall that tests take only values between $0$ and $1$. Notice that:

  1. If $T_*(x)-T(x)>0$, then $T_*(x)>0$ (for $T(x)\geq0$ for all $x$); hence $f(\textbf{x}|\theta_1)\geq kf(\textbf{x}|\theta_0)$.
  2. If $T_*(x)-T(x)<0$, then $T_*(x)<1$ (for $T(x)\leq1$ for all $x$); hence $f(\textbf{x}|\theta_1)\leq kf(\textbf{x}|\theta_0)$.

This means that $$ \big(T_*(x)-T(x)\big)\big(f(x|\theta_1)-kf(x|\theta_0)\big)\geq0 $$ for all $x$. Integration gives \begin{align} \int_X\big(T_*(x)-T(x)\big)\big(f(x|\theta_1)-kf(x|\theta_0)\big)\,dx\geq0 \end{align} Rearranging the left-hand side gives \begin{align} E_{\theta_1}[T_*(X)-T(X)]&=\int_X\big(T_*(x)-T(x)\big)f(x|\theta_1)\,dx\\ &\geq k\int_X \big(T_*(x)-T(x)\big)f(x|\theta_0)\,dx=kE_{\theta_0}[T_*(X)-T(X)]\\ &=k\big(\alpha-E_{\theta_0}[T(X)]\big)\geq0 \end{align} Hence $$ E_{\theta_1}[T_*(X)]\geq E_{\theta_1}[T(X)]$$

In other words, $T_*$ is at least as powerful as $T$.
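As a sanity check on this dominance (continuing the hypothetical Binomial(10, p) example, $H_0: p=0.5$ vs $H_1: p=0.8$, $\alpha=0.05$), one can brute-force every non-randomized rejection region of level $\alpha$ and confirm that none of them beats the randomized test $T_*$:

```python
from math import comb
from itertools import combinations

# Hypothetical setup: X ~ Binomial(10, p), H0: p = 0.5, H1: p = 0.8.
n, p0, p1, alpha = 10, 0.5, 0.8, 0.05

def pmf(x, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Power of the randomized NP test: reject on {9, 10}, randomize at x = 8.
gamma = (alpha - pmf(9, p0) - pmf(10, p0)) / pmf(8, p0)
np_power = pmf(9, p1) + pmf(10, p1) + gamma * pmf(8, p1)

# Enumerate every deterministic rejection region R with size <= alpha
# (2^11 subsets of the sample space) and record the best power achieved.
best = 0.0
for r in range(n + 2):
    for R in combinations(range(n + 1), r):
        if sum(pmf(x, p0) for x in R) <= alpha:
            best = max(best, sum(pmf(x, p1) for x in R))

print(best, np_power, best <= np_power)
```

Interestingly, the best deterministic region here is $\{8,10\}$ (power $\approx 0.409$), not $\{9,10\}$; the randomized test still dominates it with power $\approx 0.646$, illustrating why randomization matters in discrete problems.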


Observation: The key parts of the argument above are contained in (1) and (2).


We are now ready to argue for necessity, that is, that any test $T_u$ that is UMP at level $\alpha$ must be equal to $T_{*}(x)$ on $\{x: f(x|\theta_1)\neq k f(x|\theta_0)\}$.

Suppose now that $T_u$ is another UMP test of size $\alpha$; that is, $E_{\theta_0}[T_u(X)]=\alpha$, and $E_{\theta_1}[T_u(X)]\geq E_{\theta_1}[T(X)]$ for any other level-$\alpha$ test $T$. Then, since $T_*(X)$ is UMP, we must have that $E_{\theta_1}[T_*(X)]=E_{\theta_1}[T_u(X)]$. Consider the set $$ A:=\{x:T_*(x)\neq T_u(x)\}\cap\{x:f(x|\theta_1)\neq k f(x|\theta_0)\}$$ The arguments used in (1) and (2) imply that \begin{align} \begin{matrix} \big(T_*(x)-T_{u}(x)\big)\big(f(x|\theta_1)-k f(x|\theta_0)\big)>0&\text{if} &x\in A\\ \big(T_*(x)-T_{u}(x)\big)\big(f(x|\theta_1)-k f(x|\theta_0)\big)=0&\text{if} &x\in X\setminus A \end{matrix} \end{align} Integration gives \begin{align} \int_A\big(T_*(x)-T_{u}(x)\big)\big(f(x|\theta_1)-k f(x|\theta_0)\big)\,dx &=\int_X\big(T_*(x)-T_{u}(x)\big)\big(f(x|\theta_1)-k f(x|\theta_0)\big)\,dx\\ &=E_{\theta_1}[T_*(X)-T_{u}(X)]-k E_{\theta_0}[T_*(X)-T_{u}(X)]\\ &=0 \end{align} Since $\big(T_*(x)-T_{u}(x)\big)\big(f(x|\theta_1)-k f(x|\theta_0)\big)>0$ for all $x\in A$, it must be that $A$ is negligible (i.e. $\int_A\,dx=0$). Therefore $P_{\theta_0}(A)=P_{\theta_1}(A)=0$; that is, $T_*(X)=T_{u}(X)$ $\{P_{\theta_0},P_{\theta_1}\}$-almost surely.
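To see this uniqueness in action in the same hypothetical Binomial example: there the boundary set $\{x: f(x|\theta_1)=kf(x|\theta_0)\}$ is just $\{8\}$, and every sample point has positive mass under both hypotheses, so a UMP test may differ from $T_*$ only at $x=8$. The sketch below perturbs $T_*$ off the boundary, shifting rejection probability from $x=9$ (where the likelihood ratio exceeds $k$) onto $x=8$ while keeping the size exactly $\alpha$, and checks that the power strictly drops, which is the pointwise positivity of $(T_*-T_u)(f_1-kf_0)$ on $A$ made numerical.

```python
from math import comb

# Hypothetical setup as before: X ~ Binomial(10, p), H0: p = 0.5,
# H1: p = 0.8, alpha = 0.05.  T_* rejects on {9, 10} and randomizes
# with probability gamma at the boundary point x = 8.
n, p0, p1, alpha = 10, 0.5, 0.8, 0.05

def pmf(x, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

gamma = (alpha - pmf(9, p0) - pmf(10, p0)) / pmf(8, p0)
np_power = pmf(9, p1) + pmf(10, p1) + gamma * pmf(8, p1)

# Perturb T_* off the boundary: remove rejection probability eps at x = 9
# and add compensating mass at x = 8 so the size stays exactly alpha.
eps = 0.1
comp = eps * pmf(9, p0) / pmf(8, p0)

size = (1 - eps) * pmf(9, p0) + pmf(10, p0) + (gamma + comp) * pmf(8, p0)
power = (1 - eps) * pmf(9, p1) + pmf(10, p1) + (gamma + comp) * pmf(8, p1)

print(size, power < np_power)  # size unchanged, but power strictly lower
```

The power loss is $\varepsilon\, f(9|p_1) - \text{comp}\cdot f(8|p_1) > 0$ precisely because the likelihood ratio at $x=9$ is larger than $k$: any size-preserving deviation from $T_*$ off the boundary trades high-ratio mass for low-ratio mass and cannot remain UMP.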


The last bit is based on a couple of basic measure theory facts:

  1. If $f\geq0$ and $\int f\,d\mu=0$, then $\mu(\{x:f(x)>0\})=0$. That is $f$ must be $0$ almost surely (with respect to the measure $\mu$).
  2. If $\mu$ is a finite measure with a density function with respect to another ($\sigma$-finite) measure $\nu$, then $\nu(A)=0$ implies that $\mu(A)=0$. (This is related to the Radon–Nikodym theorem.)