Exercise "Mathematical Statistics - Jun Shao"


I'm trying to solve this problem:

Let $X_1, \dots, X_n$ $(n \ge 2)$ be i.i.d. random variables having the normal distribution $N(\theta, 2)$ when $\theta=0$ and the normal distribution $N(\theta, 1)$ when $\theta \in \mathbb{R}\setminus\{0\}$. Show that the sample mean $\bar{X}$ is not a sufficient statistic for $\theta$.

So, first I found the sampling distribution of the mean: $\bar{X} \sim N(0,2/n)$ when $\theta = 0$, and $\bar{X} \sim N(\theta,1/n)$ when $\theta \neq 0$.

Then I wrote the density of a single observation as \begin{align} f_{\theta}(x) = \left[ (4\pi)^{-1/2} \exp\left\{ -\frac{x^2}{4} \right\} \right]^{I_{\{\theta = 0\}}} \left[ (2\pi)^{-1/2} \exp\left\{ -\frac{(x-\theta)^2}{2} \right\} \right]^{I_{\{\theta \neq 0\}}} \end{align}

But I'm not sure how to proceed from there. I thought of using the factorization theorem, but I'm stuck with this density. Any hint?
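As a numerical sanity check of the piecewise density above (a Python sketch; the function names are mine, not from the problem), the indicator-exponent form can be compared against the plain case split:

```python
import math

def f_theta(x, theta):
    """Density of one observation: N(theta, 2) if theta == 0, else N(theta, 1)."""
    var = 2.0 if theta == 0 else 1.0
    return (2 * math.pi * var) ** -0.5 * math.exp(-(x - theta) ** 2 / (2 * var))

def f_theta_indicator(x, theta):
    """The indicator-exponent form from the question; bools act as 0/1 exponents."""
    a = (4 * math.pi) ** -0.5 * math.exp(-x ** 2 / 4)            # N(0, 2) density
    b = (2 * math.pi) ** -0.5 * math.exp(-(x - theta) ** 2 / 2)  # N(theta, 1) density
    return a ** (theta == 0) * b ** (theta != 0)

# The two forms agree for theta = 0 and theta != 0 alike
for theta in (0.0, 1.5, -0.7):
    for x in (-1.0, 0.0, 2.3):
        assert math.isclose(f_theta(x, theta), f_theta_indicator(x, theta))
```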


There are 2 answers below.

BEST ANSWER

Another way of expressing the density of a single observation is to write $$X \sim \operatorname{Normal}(\theta, 1 + \mathbb 1(\theta = 0));$$ that is to say, the variance is a function of $\theta$: $$\sigma_\theta^2 = 1 + \mathbb 1 (\theta = 0) = \begin{cases} 2, & \theta = 0 \\ 1, & \theta \ne 0. \end{cases}$$ Then the joint density of a sample $\boldsymbol x \in \mathbb R^n$ is given by $$\begin{align*} f(\boldsymbol x \mid \theta) &= (2\pi)^{-n/2}\sigma_\theta^{-n} \exp\left(-\sum_{i=1}^n \frac{(x_i - \theta)^2}{2\sigma_\theta^2} \right) \\ &= (2\pi)^{-n/2} (1 + \mathbb 1 (\theta = 0))^{-n/2} \exp \left(-\frac{\sum_{i=1}^n (x_i - \theta)^2}{2(1 + \mathbb 1 (\theta = 0))} \right). \end{align*}$$ I leave it as an exercise to show that $$\sum_{i=1}^n (x_i - \theta)^2 = \sum_{i=1}^n (x_i - \bar x)^2 + n(\bar x - \theta)^2,$$ so that $$f(\boldsymbol x \mid \theta) = (2\pi)^{-n/2} \sigma_\theta^{-n} \exp\left(-\frac{n \hat \sigma^2}{2\sigma_\theta^2}\right) \exp\left(-\frac{n(\bar x - \theta)^2}{2\sigma_\theta^2}\right),$$ where $\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^n (x_i - \bar x)^2$ is the (biased) sample variance. If $T(\boldsymbol x) = \bar x$ were sufficient for $\theta$, we would not have this additional factor containing $\hat\sigma^2$. We cannot express $f$ as $h(\boldsymbol x) g(T(\boldsymbol x) \mid \theta)$ for suitable $h$ and $g$, because the factor $\exp(-n\hat\sigma^2/(2\sigma_\theta^2))$ depends on $\theta$ (through $\sigma_\theta$) as well as on the sample beyond $\bar x$ (through $\hat\sigma^2$). This is the crux of the problem: if the variance did not depend on $\theta$, this factor would not depend on $\theta$ at all, and it could be absorbed into $h$ instead of $g$.
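The factorization of the joint density through $(\bar x, \hat\sigma^2)$ can be checked numerically; here is a minimal Python sketch (function names and the test sample are mine):

```python
import math

def joint_density(xs, theta):
    """Direct product of the n one-observation normal densities."""
    var = 2.0 if theta == 0 else 1.0  # sigma_theta^2 = 1 + 1(theta = 0)
    n = len(xs)
    return (2 * math.pi * var) ** (-n / 2) * math.exp(
        -sum((x - theta) ** 2 for x in xs) / (2 * var))

def factored_density(xs, theta):
    """Same density, factored through xbar and the biased sample variance."""
    var = 2.0 if theta == 0 else 1.0
    n = len(xs)
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / n  # biased sample variance
    return ((2 * math.pi) ** (-n / 2) * var ** (-n / 2)
            * math.exp(-n * s2 / (2 * var))
            * math.exp(-n * (xbar - theta) ** 2 / (2 * var)))

sample = [0.3, -1.2, 2.0, 0.5]
for theta in (0.0, 1.0, -2.5):
    assert math.isclose(joint_density(sample, theta), factored_density(sample, theta))
```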
However, the two-dimensional statistic $$\boldsymbol T(\boldsymbol x) = (\bar x, \hat \sigma^2)$$ is sufficient for $\theta$, because now we can choose, for example, $h(\boldsymbol x) = (2\pi)^{-n/2}$ and $$g(T_1, T_2 \mid \theta) = \sigma_\theta^{-n} \exp \left(-\frac{n T_2}{2\sigma_\theta^2}\right) \exp \left(-\frac{n (T_1 - \theta)^2}{2\sigma_\theta^2}\right).$$ The above, of course, is not entirely rigorous, because we have not formally shown that $\hat\sigma^2$ is not itself a function of $\bar x$. To show this, it suffices to exhibit two samples $\boldsymbol x_1, \boldsymbol x_2 \in \mathbb R^n$ for which $\bar x_1 = \bar x_2$ but $\hat \sigma_1^2 \ne \hat \sigma_2^2$. This is quite trivial; e.g., let $n = 2$, $\boldsymbol x_1 = (-1,1)$, and $\boldsymbol x_2 = (-10, 10)$. Then $\bar x_1 = \bar x_2 = 0$ but obviously $\hat\sigma_1^2 < \hat\sigma_2^2$. Consequently, $\hat \sigma^2$ is not a function of $\bar x$ for any $n \ge 2$ (for $n > 2$, choose all the remaining observations to be equal, reducing to the two-dimensional case).
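The counterexample at the end can be checked in a couple of lines of Python (helper names are mine):

```python
# Two samples with the same mean but different (biased) sample variances
x1 = [-1, 1]
x2 = [-10, 10]

def mean(v):
    return sum(v) / len(v)

def s2(v):
    """Biased sample variance: (1/n) * sum of squared deviations from the mean."""
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / len(v)

assert mean(x1) == mean(x2) == 0
assert s2(x1) == 1 and s2(x2) == 100  # so s2 cannot be a function of the mean
```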

ANSWER

Let us start from the definition. Suppose $\vec{X}$ is discrete. Then $T(\vec{X})$ is sufficient if the conditional distribution of $\vec{X}=(X_1,\dots,X_n)$ given $T(\vec{X})=T(\vec{x})$ does not depend on $\theta$. In this case it is equivalent to require that $P(\vec{X}=\vec{x}\mid T(\vec{X})=T(\vec{x}),\theta)$ be constant in $\theta$. Since the event $\{\vec{X}=\vec{x}\}$ is contained in $\{T(\vec{X})=T(\vec{x})\}$, the definition of conditional probability gives $$P(\vec{X}=\vec{x}\mid T(\vec{X})=T(\vec{x}),\theta)=\frac{P(\vec{X}=\vec{x}\mid\theta)}{P(T(\vec{X})=T(\vec{x})\mid\theta)}.$$ In the continuous case, let $B\subset \vec{X}(\Omega)\subset \mathbb{R}^n$ be a small set (recall that each $X_i$ is a measurable function, so $\vec{X}$ takes values in $\mathbb{R}^n$). Then $$P(\vec{X}\in B\mid T(\vec{X})\in T(B),\theta)=\frac{P(\vec{X}\in B\mid\theta)}{P(T(\vec{X})\in T(B)\mid\theta)}$$ must be constant in $\theta$, and hence so is $$\frac{\frac{1}{|B|}P(\vec{X}\in B\mid\theta)}{\frac{1}{|T(B)|}P(T(\vec{X})\in T(B)\mid\theta)},$$ where $|B|$ denotes the volume of $B$. Letting $|B|\to 0$, the ratio of density functions $$\frac{f_{\vec{X}}(\vec{x}\mid\theta)}{f_{T(\vec{X})}(T(\vec{x})\mid\theta)}$$ must therefore also be constant in $\theta$.
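The $|B|\to 0$ step can be illustrated numerically for the one-dimensional statistic $\bar{X}\sim N(\theta,\sigma_\theta^2/n)$: shrinking an interval around a point, the probability divided by the interval length approaches the density there. A Python sketch (all names and the test values are mine):

```python
import math

def normal_cdf(x, mu, var):
    """CDF of N(mu, var), via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / math.sqrt(2 * var)))

def normal_pdf(x, mu, var):
    return (2 * math.pi * var) ** -0.5 * math.exp(-(x - mu) ** 2 / (2 * var))

# Xbar ~ N(theta, var/n); shrink the interval and watch P(B)/|B| approach the pdf
theta, var, n, a = 1.0, 1.0, 5, 0.7
for eps in (0.1, 0.01, 0.001):
    prob = normal_cdf(a + eps, theta, var / n) - normal_cdf(a - eps, theta, var / n)
    approx = prob / (2 * eps)

assert abs(approx - normal_pdf(a, theta, var / n)) < 1e-4
```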

Now in this problem, we can calculate this ratio directly: $$\begin{aligned} \frac{f_{\vec{X}}(\vec{x}\mid\theta)}{f_{T(\vec{X})}(T(\vec{x})\mid\theta)}&=\frac{(2\pi\sigma_{\theta}^2)^{-\frac{n}{2}}\exp\left(-\frac{1}{2\sigma_{\theta}^2}\sum_{i=1}^n (x_i-\theta)^2\right)}{\left(2\pi\frac{\sigma_{\theta}^2}{n}\right)^{-\frac{1}{2}}\exp\left(-\frac{n}{2\sigma_{\theta}^2}(\bar{x}-\theta)^2\right)}\\ &=C\sigma_{\theta}^{1-n}\exp\left(-\frac{1}{2\sigma_{\theta}^2}\sum_{i=1}^n (x_i-\bar{x})^2\right), \end{aligned}$$ where $C$ depends only on $n$. This expression takes different values for $\sigma_{\theta}^2=1$ (i.e. $\theta\neq 0$) and $\sigma_{\theta}^2=2$ (i.e. $\theta=0$), so the ratio is not constant in $\theta$. This shows that the statistic $T(\vec{X})=\bar{X}$ is not sufficient.
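One can also verify this ratio numerically: for a fixed sample it is the same for all $\theta\neq 0$ (they share $\sigma_\theta^2=1$) but changes at $\theta=0$, where $\sigma_\theta^2=2$. A Python sketch (function name and sample are mine):

```python
import math

def ratio(xs, theta):
    """f_X(x | theta) / f_Xbar(xbar | theta) for the model in the problem."""
    var = 2.0 if theta == 0 else 1.0  # sigma_theta^2
    n = len(xs)
    xbar = sum(xs) / n
    num = (2 * math.pi * var) ** (-n / 2) * math.exp(
        -sum((x - theta) ** 2 for x in xs) / (2 * var))
    den = (2 * math.pi * var / n) ** -0.5 * math.exp(
        -n * (xbar - theta) ** 2 / (2 * var))
    return num / den

sample = [0.3, -1.2, 2.0, 0.5]
r0 = ratio(sample, 0.0)   # sigma_theta^2 = 2
r1 = ratio(sample, 1.0)   # sigma_theta^2 = 1
r2 = ratio(sample, -3.0)  # sigma_theta^2 = 1
assert math.isclose(r1, r2)       # depends on theta only through sigma_theta ...
assert not math.isclose(r0, r1)   # ... and genuinely changes at theta = 0
```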