Show that $E\left[(\hat{\mu}[n+1]-m)^2 \right] \le E\left[(\hat{\mu}[c]-m)^2 \right] $ (minimizing expected MSE)


Given a sample $(X_1,X_2,\ldots, X_n)$ where the $X_i$ are i.i.d. exponential random variables with parameter $\frac{1}{m}$ (mean $m$), let

$$\mathcal T=\{\hat{\mu}[c] \, \vert\, c>0\} \hspace{0.5cm} \text{and} \hspace{0.5cm} \hat{\mu}\,[c]=\frac{1}{c}\sum_{i=1}^n X_i.$$

Show that the estimator $\hat{\mu}[n+1]$ has the smallest expected MSE. In other words, I want to show that:

$$E\left[(\hat{\mu}[n+1]-m)^2 \right] \le E\left[(\hat{\mu}[c]-m)^2 \right] \quad \text{for all } c>0.$$

What I have tried so far:

$$\begin{aligned}E\left[(\hat{\mu}[n+1]-m)^2 \right] &=\big(E\left[\hat{\mu}[n+1]-m \right]\big)^2+\text{Var}(\hat{\mu}[n+1]) \\[5pt] &=\big(E\left[\hat{\mu}[n+1]\right]-m \big)^2+\text{Var}(\hat{\mu}[n+1]) \\[5pt] &= \left(E\left[\frac{1}{n+1}\sum_{i=1}^n X_i\right]-m \right)^2+ \text{Var}(\hat{\mu}[n+1]) \\[5pt] &= \left(\frac{1}{n+1}\sum_{i=1}^n E[X_i]-m \right)^2+ \text{Var}(\hat{\mu}[n+1]) \\[5pt] &= \left( \frac{nm}{n+1}-m\right)^2+\text{Var}(\hat{\mu}[n+1]) \\[5pt] &= \frac{m^2}{(n+1)^2} +\text{Var}(\hat{\mu}[n+1]) \le\ldots?\end{aligned}$$

I was trying to reach an expression that I could show is minimized at $c=n+1$, but I am not able to rewrite the variance, and I am not sure this is the right approach. Alternatively, I thought of showing that $c=n+1$ minimizes the expected MSE by differentiating, but that did not work either. How can I show this statement is true?


Edit: Based on the answer from heropup, I calculated the minimum and bias.

Minimizing the expression:

$$\begin{equation*} \begin{split} &\phantom{\iff} \, \, \,\frac{d}{dc}\left(\frac{m^2(c-n)^2+m^2n}{c^2}\right) = 0 \\[10pt] &\iff \frac{2m^2(c-n)c^2-2c\left(m^2(c-n)^2+m^2n\right)}{c^4}=0 \\[10pt] &\iff 2(c-n)c^2-2c\left((c-n)^2+n\right)=0 \\[10pt] &\iff 2nc^2-2n^2c-2nc=0 \\[10pt] &\iff c(c-n-1)=0 \\[10pt] &\iff c=n+1 \quad (\text{since } c>0) \end{split} \end{equation*} $$

We also have $$\frac{d^2}{dc^2}\left(\frac{m^2(c-n)^2+m^2n}{c^2}\right)=\frac{2m^2 n(3n-2c+3)}{c^4}$$

and at the point $c=n+1$ the second derivative is positive, since $n,m>0$:

$$\frac{2m^2 n(n+1)}{(n+1)^4}>0$$

Therefore the expected MSE is minimized for $c=n+1$ and the inequality holds.
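As a sanity check on the calculus above, a short symbolic computation confirms both the critical point and the positivity of the second derivative there (this uses sympy, which is my own addition and not part of the original post):

```python
# Symbolic sanity check of the minimization above (assumption: sympy).
import sympy as sp

c, n, m = sp.symbols('c n m', positive=True)
mse = (m**2 * (c - n)**2 + m**2 * n) / c**2   # MSE of mu-hat[c]

# The only critical point with c > 0 is c = n + 1 ...
print(sp.solve(sp.diff(mse, c), c))           # [n + 1]

# ... and the second derivative at c = n + 1 simplifies to 2*m^2*n/(n+1)^3,
# which is positive for n, m > 0, so this is a minimum.
print(sp.simplify(sp.diff(mse, c, 2).subs(c, n + 1)))
```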

For the bias:

$$\begin{equation*} \begin{split} E\left[\hat{\mu}[n+1]\right]-m &= E\left[ \frac{1}{n+1} \sum_{i=1}^n X_i\right]-m \\[10pt] &= \frac{1}{n+1}\sum_{i=1}^n E[X_i]-m \\[10pt] &= \frac{1}{n+1} nm-m \\[10pt] &= -\frac{m}{n+1} \not= 0 \end{split} \end{equation*}$$

Therefore this estimator is biased, with bias $-\frac{m}{n+1}$.
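The bias can also be checked numerically with a quick Monte Carlo sketch (this assumes numpy; the values of $n$ and $m$ are arbitrary choices of mine, not from the original post):

```python
# Monte Carlo check of the bias of mu-hat[n+1] (assumption: numpy,
# arbitrary n and m).
import numpy as np

rng = np.random.default_rng(0)
n, m = 10, 2.0                                    # arbitrary sample size and mean
samples = rng.exponential(scale=m, size=(200_000, n))
mu_hat = samples.sum(axis=1) / (n + 1)            # the estimator mu-hat[n+1]

empirical_bias = mu_hat.mean() - m
print(empirical_bias)                             # close to -m/(n+1) = -2/11
```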


Best answer:

Let $$Y_n = \sum_{i=1}^n X_i = c \hat \mu[c]$$ be the sample total, which is not dependent on the choice of $c$. What is the mean and variance of this random variable? Since the $X_i$ are iid exponential, it is easy to see $$\operatorname{E}[Y_n] = n \operatorname{E}[X_1] = nm, \quad \operatorname{Var}[Y_n] = n \operatorname{Var}[X_1] = nm^2.$$ Then the bias of the estimator $\hat \mu[c]$ is simply $$\operatorname{Bias}[\hat \mu[c]] = \operatorname{E}[\hat \mu[c] - m] = \frac{nm}{c} - m = \left(\frac{n}{c} - 1 \right) m$$ and the variance of the estimator is $$\operatorname{Var}[\hat \mu[c]] = \operatorname{Var}[Y_n/c] = \frac{n}{c^2} m^2.$$ So the MSE is $$\operatorname{MSE}[\hat \mu[c]] = \operatorname{Bias}^2[\hat \mu[c]] + \operatorname{Var}[\hat \mu[c]] = \left(\left(\frac{n}{c} - 1\right)^2 + \frac{n}{c^2} \right) m^2 = \frac{(c-n)^2 + n}{c^2} m^2.$$

For what value of $c$ is this function minimized? We can clearly ignore $m^2$, so treat this as a continuous function of $c$ for fixed $m, n$, differentiate with respect to $c$, locate its critical point(s), and the rest is obvious.
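To see the minimum concretely, here is a small numerical sketch (assuming numpy; the values of $n$ and $m$ are arbitrary and not from the original answer) that evaluates the closed-form MSE above on a grid of $c$ values:

```python
# Evaluate MSE[mu-hat[c]] = ((c - n)^2 + n) / c^2 * m^2 on a grid of c
# and locate the minimizer (assumption: numpy, arbitrary n and m).
import numpy as np

n, m = 10, 2.0                                    # arbitrary choices
cs = np.arange(1, 2 * n + 2)                      # candidate values c = 1, ..., 2n+1
mse = ((cs - n)**2 + n) / cs**2 * m**2            # closed-form MSE from the answer
print(cs[np.argmin(mse)])                         # 11, i.e. n + 1
```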


The key strategic point you overlooked is that you tried to compute the MSE for the specific choice that already minimizes it, namely $c = n+1$, rather than computing it for a general value of $c$ and using calculus to find the global minimum. The latter approach, as you can see, is much easier.

As a further exercise, note that this choice for $c$ does not result in an unbiased estimator. What is the bias? How does this example illustrate the concept of "variance-bias tradeoff" with respect to parameter estimation?