Mean squared error calculation


If $X_1,\dots,X_n \sim N(\mu, \sigma^2)$, where $\mu$ is known and $\sigma^2$ is unknown, calculate the MSE of $V^2$.

$V^2 = \frac1n \sum_{i=1}^n (X_i-\mu)^2$, with $E(V^2) = \frac1n \sum_{i=1}^n Var(X_i) =\sigma^2$

Therefore:

$MSE(V^2) = Var(V^2) = \frac{1}{n^2}nVar[(X_1-\mu)^2]=\frac{1}{n}Var[\sigma^2(\frac{X_1-\mu}{\sigma})^2]=\frac{1}{n}\sigma^4Var[(\frac{X_1-\mu}{\sigma})^2]=\frac{2\sigma^4}{n}$

However, I do not understand some of the steps:

  1. Where does the $X_1$ suddenly come from (instead of $X_i$)?
  2. In the next step, I am aware it has something to do with the fact that $\left(\frac{X-\mu}{\sigma}\right)^2 \sim \chi^2_1$, but I cannot connect the dots.

Could someone break these down for me? I do not have a mathematical background, so stating the obvious is very welcome.

Best answer

Let $X_1, X_2, \dots, X_n$ be a random sample from $\mathsf{Norm}(\mu, \sigma),$ where $\mu$ is known and $\sigma^2$ is to be estimated by $V = \frac 1 n\sum_{i=1}^n (X_i - \mu)^2.$ (Note the use of the known population mean $\mu,$ not the sample mean $\bar X.)$ You want to evaluate $MSE(V).$ @Michael and I have given you some hints. (Notice that my $V$ is your $V^2,$ to simplify notation a bit.)
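On the two steps you asked about, here is a sketch of how I would read them. It assumes only that the $X_i$ are independent and identically distributed, and it uses the standard fact that the square of a standard normal variable is $\chi^2_1,$ which has variance $2.$

$$\begin{aligned}
Var(V) &= Var\left[\frac1n\sum_{i=1}^n (X_i-\mu)^2\right] = \frac{1}{n^2}\sum_{i=1}^n Var\left[(X_i-\mu)^2\right] && \text{(independence)}\\
&= \frac{1}{n^2}\,n\,Var\left[(X_1-\mu)^2\right] && \text{(all $n$ terms are equal)}\\
&= \frac{1}{n}\,\sigma^4\,Var\left[\left(\frac{X_1-\mu}{\sigma}\right)^2\right] && \text{(pull out the constant $\sigma^2$, squared)}\\
&= \frac{1}{n}\,\sigma^4\cdot 2 = \frac{2\sigma^4}{n} && \text{(variance of $\chi^2_1$ is 2)}
\end{aligned}$$

So $X_1$ is just a representative: since every $X_i$ has the same distribution, any index would give the same variance. The quantity that is $\chi^2_1$ is the square $\left(\frac{X_1-\mu}{\sigma}\right)^2,$ not $\frac{X_1-\mu}{\sigma}$ itself, which is standard normal.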

With that orientation, I hope the following example with specific numbers for the quantities involved will help you do the required general derivation.

Suppose $n = 5,\, \mu = 0$ and $\sigma = 4.$ Then $Q = \frac{nV}{\sigma^2} = \frac{5V}{16} \sim \mathsf{Chisq}(n=5),$ which has mean $n=5$ and variance $2n=10.$ So $E(V) = \frac{\sigma^2}{n}n = 16\,$ (showing that $V$ is unbiased for $\sigma^2)$ and $Var(V) = MSE(V) = \frac{\sigma^4}{n^2}2n = 102.4.$
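The same steps can be written for general $n$ and $\sigma.$ This is a sketch that uses only the mean $n$ and variance $2n$ of $\mathsf{Chisq}(n)$ quoted above, together with $V = \frac{\sigma^2}{n}Q$:

$$E(V) = \frac{\sigma^2}{n}E(Q) = \frac{\sigma^2}{n}\,n = \sigma^2, \qquad Var(V) = \frac{\sigma^4}{n^2}Var(Q) = \frac{\sigma^4}{n^2}\,2n = \frac{2\sigma^4}{n}.$$

Because $V$ is unbiased, $MSE(V) = Var(V) + [E(V)-\sigma^2]^2 = Var(V).$ Plugging in $n=5$ and $\sigma=4$ gives $\frac{2(256)}{5} = 102.4,$ matching the number above.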

The following demonstration, using R statistical software, with a million such samples of size $n=5$ illustrates these numerical results to several significant digits. In the program MAT is a $10^6 \times 5$ matrix, in which each row is a sample of size $5.$

set.seed(715)  # retain for exactly same simulation, delete for fresh run
m = 10^6; n = 5; mu = 0; sg = 4
x = rnorm(m*n, mu, sg);  MAT = matrix(x, nrow = m)
v = rowMeans((MAT - mu)^2)  # using 'known' population mean, not sample mean
mean(v);  mean((v-sg^2)^2)
[1] 15.99998    # approx. E(V) = 16
[1] 102.5       # approx. MSE(V) = 102.4
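To see that $MSE(V) = \frac{2\sigma^4}{n}$ is not special to $n=5,$ one could wrap the simulation in a small function and try other settings. This is only a sketch: `sim_mse` and its argument names are made up for illustration, and the default of $10^5$ samples (rather than a million) is just to keep it quick.

```r
# Hypothetical helper (not part of the program above):
# simulate V for given n, mu, sg and compare with 2*sigma^4/n
sim_mse <- function(n, mu, sg, m = 1e5) {
  MAT <- matrix(rnorm(m * n, mu, sg), nrow = m)  # each row is one sample of size n
  v <- rowMeans((MAT - mu)^2)                    # V uses the known mean mu
  c(simulated = mean((v - sg^2)^2),              # simulated MSE of V
    exact     = 2 * sg^4 / n)                    # theoretical MSE of V
}

set.seed(1)
sim_mse(n = 5,  mu = 0, sg = 4)   # same setting as the example above
sim_mse(n = 20, mu = 3, sg = 2)   # a different n, mu and sigma
```

In each call the two returned values should agree to about two significant digits; the agreement tightens as `m` grows.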

The plot below shows the simulated distribution of $Q = \frac{nV}{\sigma^2} = \frac{5V}{16} = 0.3125V$ along with the density curve of $\mathsf{Chisq}(5).$

hist(5*v/sg^2, probability=TRUE, breaks=40, xlab="q", col="skyblue2", main="")
  curve(dchisq(x, 5), add=TRUE, lwd=2, n=1001)

[Figure: histogram of the simulated values of $Q$ with the $\mathsf{Chisq}(5)$ density curve overlaid]