Biased estimator of the Variance of a Gaussian Distribution


I'm reading the Deep Learning book and I've encountered a section that is difficult for me to understand, specifically the transformation from equation (5.38) to (5.39): $$ \mathbb{E}\left[\hat{\sigma}^2_m\right]=\mathbb{E}\left[\frac{1}{m}\sum_{i=1}^{m}\left( x^{(i)} - \hat{\mu}_m \right)^2 \right] \\ = \frac{m - 1}{m}\sigma^2 $$

Can anyone explain to me how we get to $\frac{m - 1}{m}\sigma^2$?
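A quick numerical check confirms the claim before any algebra; here is a minimal Monte Carlo sketch (assuming NumPy; the parameter values are arbitrary and only illustrative):

```python
# Minimal Monte Carlo check of E[sigma_hat^2] = (m-1)/m * sigma^2.
import numpy as np

rng = np.random.default_rng(0)
m, sigma, trials = 5, 2.0, 200_000

samples = rng.normal(loc=0.0, scale=sigma, size=(trials, m))
# ddof=0 divides by m, i.e. the biased estimator sigma_hat^2 from (5.38)
biased_var = samples.var(axis=1, ddof=0)

print(biased_var.mean())       # approx 3.2
print((m - 1) / m * sigma**2)  # exactly (m-1)/m * sigma^2 = 3.2
```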


BEST ANSWER

One way to get the equality is to evaluate the expected value of the unbiased estimator $s^2$. I use $n$ instead of $m$, and $\overline X$ for the (unbiased) estimator of $\mu$.

$E(s^2)=E\left[\frac{1}{n-1}\sum_{i=1}^n (X_i-\overline X )^2\right]$

$=\frac{1}{n-1}E\left[\sum_{i=1}^n (X_i-\overline X)^2 \right]$

Adding and subtracting $\mu$ inside the square:

$=\frac{1}{n-1}E\left[\sum_{i=1}^n \left[(X_i-\mu)-(\overline X-\mu) \right]^2 \right] \quad$

Multiplying out:

$=\frac{1}{n-1}E\left[\sum_{i=1}^n \left[(X_i-\mu)^2-2(\overline X-\mu)(X_i-\mu)+(\overline X-\mu)^2 \right]\right] \quad$

Distributing the sum over the three terms:

$=\frac{1}{n-1}E\left[\sum_{i=1}^n (X_i-\mu)^2-2(\overline X-\mu)\sum_{i=1}^n(X_i-\mu)+\sum_{i=1}^n(\overline X-\mu)^2 \right] \quad$

Since $(\overline X-\mu)^2$ does not depend on $i$, the last sum equals $n(\overline X-\mu)^2$:

$=\frac{1}{n-1}E\left[\sum_{i=1}^n (X_i-\mu)^2-2(\overline X-\mu)\color{blue}{\sum_{i=1}^n(X_i-\mu)}+n(\overline X-\mu)^2 \right] \quad$


Transforming the blue term:

$\sum_{i=1}^n(X_i-\mu)=n\cdot \overline X-n\cdot \mu=n(\overline X-\mu)$

Thus $2(\overline X-\mu)\color{blue}{\sum_{i=1}^n(X_i-\mu)}=2(\overline X-\mu)\cdot (n\cdot \overline X-n\cdot \mu)=2n( \overline X- \mu)^2$


$=\frac{1}{n-1}E\left[\sum_{i=1}^n (X_i-\mu)^2-2n( \overline X- \mu)^2+n(\overline X-\mu)^2 \right] \quad$

$=\frac{1}{n-1}E\left[\sum_{i=1}^n (X_i-\mu)^2-n( \overline X- \mu)^2\right] \quad$

$=\frac{1}{n-1}\left[\sum_{i=1}^n E\left[(X_i-\mu)^2\right]-nE\left[( \overline X- \mu)^2\right]\right] \quad$

We know that $E\left[(X_i-\mu)^2\right]=\sigma^2$ and that $E\left[( \overline X- \mu)^2\right]=\sigma_{\overline X}^2=\frac{\sigma^2}{n}$, the variance of the sample mean. Thus we get

$=\frac{1}{n-1}\left[n \cdot \sigma ^2-n\, \frac{\sigma ^2}{n}\right]=\frac{1}{n-1}\,\sigma^2 \cdot (n-1)=\boxed{\sigma ^2}$

so $E(s^2)=\sigma^2$: the estimator $s^2$ is unbiased.
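As a sanity check on the two facts used in the last step, here is a simulation sketch (NumPy assumed, parameter values arbitrary): the $\frac{1}{n-1}$ estimator averages to $\sigma^2$, and the sample mean satisfies $E[(\overline X-\mu)^2]=\sigma^2/n$.

```python
# Simulation check of the two ingredients of the derivation:
#   E[s^2] = sigma^2             (s^2 uses ddof=1, i.e. divides by n-1)
#   E[(Xbar - mu)^2] = sigma^2/n (variance of the sample mean)
import numpy as np

rng = np.random.default_rng(1)
n, mu, sigma, trials = 10, 1.0, 2.0, 200_000

samples = rng.normal(mu, sigma, size=(trials, n))
s2 = samples.var(axis=1, ddof=1)
xbar = samples.mean(axis=1)

print(s2.mean(), sigma**2)                      # both approx 4.0
print(np.mean((xbar - mu) ** 2), sigma**2 / n)  # both approx 0.4
```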

If you instead calculate $E(\hat \sigma^2_n)= E\left[\frac{1}{n}\sum_{i=1}^n (X_i-\overline X )^2\right]$, the calculation is the same except that the prefactor is $\frac{1}{n}$ rather than $\frac{1}{n-1}$, so the factor $(n-1)$ no longer cancels:

$E(\hat \sigma^2_n)=\frac{n-1}{n}\cdot \sigma^2,$

which is exactly the $\frac{m-1}{m}\sigma^2$ in equation (5.39).
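The factor $\frac{n-1}{n}$ holds not just in expectation: $\hat\sigma^2_n=\frac{n-1}{n}s^2$ is true sample by sample, as this small check shows (an illustrative sketch, NumPy assumed):

```python
# sigma_hat^2 = (n-1)/n * s^2 holds sample-by-sample, not just on average.
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=7)
n = len(x)

print(x.var(ddof=0))                # biased estimator, divides by n
print((n - 1) / n * x.var(ddof=1))  # same number
```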

ANSWER

If you expand the square you have \begin{align*} \mathbb{E}\left[\frac{1}{m} \sum_i \left( x_i^2 - 2 x_i \hat{\mu} + \hat{\mu}^2 \right)\right] &= \frac{1}{m} \mathbb{E} \sum_i x_i^2 - \mathbb{E} \left[ 2\hat{\mu}\, \frac{1}{m}\sum_i x_i - \hat{\mu}^2 \right] \\ &= \mathbb{E} \left[ X^2 \right] - \mathbb{E} \left[ \hat{\mu}^2 \right], \end{align*} since $\frac{1}{m}\sum_i x_i = \hat{\mu}$. Now we have \begin{align*} \left( \sum_i x_i \right)^2 = m \sum_i x_i^2 - \frac{1}{2}\sum_i \sum_j (x_i - x_j)^2. \end{align*} The terms with $i = j$ vanish, so taking expectations and using independence ($\mathbb{E}[x_i x_j]=\mathbb{E}[X]^2$ for $i \neq j$) gives \begin{align*} \mathbb{E}\left[ \left( \sum_i x_i \right)^2\right] &= m^2 \mathbb{E}\left[ X^2 \right] - \frac{1}{2}\sum_i \sum_{j \neq i} \mathbb{E} \left( x_i^2 - 2x_i x_j + x_j^2 \right) \\ &= m^2 \mathbb{E}\left[ X^2 \right] - m(m-1)\left( \mathbb{E}[X^2] - \mathbb{E}[X]^2 \right) \\ &= m^2 \mathbb{E}[X^2] - m(m-1)\sigma^2. \end{align*} Dividing through by $m^2$, the left-hand side becomes $\mathbb{E}[\hat{\mu}^2]$, so \begin{align*} \mathbb{E} \left[ X^2 \right] - \mathbb{E}\left[ \hat{\mu}^2 \right] = \frac{m-1}{m}\sigma^2. \end{align*}
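The pairwise-difference identity in the middle is easy to sanity-check numerically; here is an illustrative sketch (NumPy assumed, not part of the original answer):

```python
# Check: (sum x_i)^2 == m * sum x_i^2 - 0.5 * sum_{i,j} (x_i - x_j)^2
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=6)
m = len(x)

lhs = x.sum() ** 2
diffs = x[:, None] - x[None, :]                  # matrix of x_i - x_j
rhs = m * np.sum(x**2) - 0.5 * np.sum(diffs**2)

print(lhs, rhs)  # equal up to floating-point rounding
```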