How to show that MSE of ML estimator is greater than that of Bayesian posterior mean?


This question is based on problem 9 from chapter 4 of Gelman et al.'s Bayesian Data Analysis.

Suppose we observe $y\sim N(\theta,\sigma^2)$ and wish to estimate $\theta$, with $\sigma^2$ known. We also know that $\theta\in [0,1]$. We compare two estimators:

  1. The MLE restricted to $[0,1]$, or
  2. The posterior mean of $\theta$, where the prior for $\theta$ is Uniform$(0,1)$.

The challenge is to show that, if $\sigma^2$ is large enough, then estimator (2) has lower MSE than estimator (1).

What I have tried so far: I'm having a hard time getting off the ground on this one. I know that estimator (1) is equal to $y$ for $y\in[0,1]$, and is equal to $0$ or $1$ if $y$ is below or above $[0,1]$, respectively. Therefore the MSE is given by: $$ \mathbb{E}(\theta-\hat\theta_{MLE})^2 = \int_{-\infty}^0 \theta^2 p(y|\theta)dy+\int_0^1(\theta-y)^2p(y|\theta)dy + \int_1^\infty (\theta-1)^2p(y|\theta)dy $$ where $p(y|\theta)$ is the N$(\theta,\sigma^2)$ pdf. But this doesn't seem very productive, and I can't see how to get anything useful out of it.
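For what it's worth, the expression can at least be sanity-checked numerically. Below is a short Monte Carlo sketch (standard library only; the function names are my own) that estimates the MSE of the clamped MLE for a given $\theta$ and $\sigma$:

```python
import random

def mle_clamped(y):
    """Restricted MLE: over theta in [0, 1], the N(y; theta, sigma^2)
    likelihood is maximized by clamping the observation y to [0, 1]."""
    return min(max(y, 0.0), 1.0)

def mse_mle(theta, sigma, n=200_000, seed=0):
    """Monte Carlo estimate of E[(theta - clamp(y))^2], y ~ N(theta, sigma^2)."""
    rng = random.Random(seed)
    return sum((theta - mle_clamped(rng.gauss(theta, sigma))) ** 2
               for _ in range(n)) / n
```

For $\theta = 0.5$, for example, `mse_mle(0.5, 50.0)` comes out very close to $0.25$, while `mse_mle(0.5, 0.01)` is tiny, so the estimator's MSE clearly grows toward a plateau as $\sigma$ increases.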


Best answer:

I will show that, as $\sigma \rightarrow \infty$, the MLE converges in distribution to $$\hat{\theta}_{MLE} = \begin{cases} 0 & \text{with probability } \frac{1}{2}, \\ 1 & \text{with probability } \frac{1}{2}, \end{cases}$$ while the posterior mean $\theta_P$ converges to $$\theta_P = \frac{1}{2},$$ which has lower MSE.

Recall that $\theta \in [0, 1]$ is fixed. As $\sigma \rightarrow \infty$, $\mathbb{P}(y \in [0, 1] \mid \theta, \sigma)\rightarrow 0$, while $\mathbb{P}(y < 0 \mid \theta, \sigma)$ and $\mathbb{P}(y > 1 \mid \theta, \sigma)$ both tend to $\frac{1}{2}$. Hence $\hat{\theta}_{MLE}$, which clamps $y$ to $[0, 1]$, converges in distribution to the two-point distribution above.

Because the prior is Uniform$(0,1)$, the posterior density is $$p(\theta \mid y, \sigma) = \frac{p(y \mid \theta, \sigma)}{\int_0^1 p(y \mid \theta, \sigma) d\theta}$$ for $\theta \in [0, 1]$ and $p(\theta \mid y, \sigma) = 0$ for $\theta \notin [0, 1]$. As $\sigma \rightarrow \infty$, the likelihood $p(y \mid \theta, \sigma)$ becomes flat as a function of $\theta$ on $[0, 1]$: for fixed $y$, the ratio of its values at the two endpoints is $$\frac{p(y \mid \theta = 1, \sigma)}{p(y \mid \theta = 0, \sigma)} = \frac{\exp(-(y - 1)^2/2\sigma^2)}{\exp(-(y - 0)^2/2\sigma^2)} = \exp\left( \frac{2y - 1}{2\sigma^2} \right) \rightarrow 1.$$ Treating $p(y \mid \theta, \sigma)$ as a constant $C$ on $[0, 1]$, the posterior mean is $$\theta_P = \frac{\int_0^1 \theta p(y \mid \theta, \sigma) d\theta}{\int_0^1 p(y \mid \theta, \sigma) d\theta} = \frac{\int_0^1 \theta C d\theta}{\int_0^1C d\theta} = \frac{1}{2}.$$
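This limit can also be checked numerically: under the Uniform$(0,1)$ prior the posterior is a N$(y, \sigma^2)$ density truncated to $[0, 1]$, whose mean has the standard closed form for a truncated normal. A small sketch (standard library only; the function names are my own):

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def posterior_mean(y, sigma):
    """Mean of a N(y, sigma^2) density truncated to [0, 1], i.e. the
    posterior mean of theta under the Uniform(0, 1) prior."""
    a = -y / sigma          # standardized lower endpoint
    b = (1.0 - y) / sigma   # standardized upper endpoint
    mass = norm_cdf(b) - norm_cdf(a)
    return y + sigma * (norm_pdf(a) - norm_pdf(b)) / mass
```

For example, `posterior_mean(0.3, 100.0)` is already within $0.01$ of $\frac{1}{2}$, whereas for small $\sigma$, `posterior_mean(0.3, 0.1)` stays close to the observation $0.3$, as the flat-likelihood argument predicts.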

Calculating the MSEs as $\sigma \rightarrow \infty$: $$\mathbb{E}\left((\theta - \hat{\theta}_{MLE})^2\right) \rightarrow \frac{1}{2}(\theta - 0)^2 + \frac{1}{2}(\theta - 1)^2 = \theta^2 - \theta + \frac{1}{2}$$ and $$\mathbb{E}\left((\theta - \theta_P)^2\right) \rightarrow \left(\theta - \frac{1}{2}\right)^2 = \theta^2 - \theta + \frac{1}{4}.$$ The asymptotic MSE of the posterior mean is lower than that of the MLE by exactly $\frac{1}{4}$, uniformly in $\theta \in [0, 1]$.
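To make the comparison concrete at a finite (but large) $\sigma$, here is a Monte Carlo sketch comparing the two estimators (standard library only; the names are my own). At $\theta = 0.2$, say, the estimates should sit near the limits $\theta^2 - \theta + \frac{1}{2} = 0.34$ and $(\theta - \frac{1}{2})^2 = 0.09$:

```python
import math
import random

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def mle_clamped(y, sigma):
    """Restricted MLE: clamp y to [0, 1] (sigma unused; kept so both
    estimators share one signature)."""
    return min(max(y, 0.0), 1.0)

def posterior_mean(y, sigma):
    """Mean of N(y, sigma^2) truncated to [0, 1] (posterior mean under
    the Uniform(0, 1) prior)."""
    a, b = -y / sigma, (1.0 - y) / sigma
    mass = norm_cdf(b) - norm_cdf(a)
    if mass <= 0.0:  # tail underflow far from [0, 1]: fall back to clamping
        return min(max(y, 0.0), 1.0)
    return y + sigma * (norm_pdf(a) - norm_pdf(b)) / mass

def mse(estimator, theta, sigma, n=100_000, seed=1):
    """Monte Carlo estimate of E[(theta - estimator(y, sigma))^2]."""
    rng = random.Random(seed)
    return sum((theta - estimator(rng.gauss(theta, sigma), sigma)) ** 2
               for _ in range(n)) / n

theta, sigma = 0.2, 20.0
assert mse(posterior_mean, theta, sigma) < mse(mle_clamped, theta, sigma)
```

With these values the MLE's estimated MSE lands close to $0.34$ and the posterior mean's close to $0.09$, matching the asymptotic calculation above.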