Expected value of function of vectors from the p-variate normal distribution


I am proving that the sample covariance matrix $\mathbf{S}=\frac{1}{n-1}\mathbf{X}^T\mathbf{X}$ is an unbiased estimator of the covariance matrix $\mathbf{\Sigma}$, where $\mathbf{X}$ is a mean-centred $n \times p$ data matrix. Note that $\mathbf{\hat\mu}=\frac{1}{n}\sum_{i=1}^n\mathbf{x}_i$ is the MLE for $\mathbf{\mu}$ under the $p$-variate normal distribution $N_p(\mathbf{\mu},\mathbf{\Sigma})$. This is what I have so far:

$$\mathbb{E}[\mathbf{S}]= \mathbb{E}\left[\frac{1}{n-1}\mathbf{X}^T\mathbf{X}\right]= \frac{1}{n-1}\mathbb{E}\left[ \sum_{i=1}^n (\mathbf{x}_i-\mathbf{\hat\mu})(\mathbf{x}_i-\mathbf{\hat\mu})^T\right]= \frac{1}{n-1}\mathbb{E}\left[ \sum_{i=1}^n \left(\mathbf{x}_i\mathbf{x}_i^T-\mathbf{x}_i\mathbf{\hat\mu}^T-\mathbf{\hat\mu}\mathbf{x}_i^T+\mathbf{\hat\mu}\mathbf{\hat\mu}^T\right)\right]= $$ $$\frac{1}{n-1}\sum_{i=1}^n \left(\mathbb{E}[\mathbf{x}_i\mathbf{x}_i^T]-\mathbb{E}[\mathbf{\hat\mu}\mathbf{\hat\mu}^T]\right), $$ where the cross terms collapse because $\sum_{i=1}^n \mathbf{x}_i = n\mathbf{\hat\mu}$. But here is where I am unsure how to proceed.

I know that $\mathbf{x}_i \sim N_p(\mathbf{\mu},\mathbf{\Sigma})$ gives $\mathbf{\hat\mu} \sim N_p\!\left(\mathbf{\mu},\frac{\mathbf{\Sigma}}{n}\right)$, but how does this give an expected value for $\mathbf{\hat\mu}\mathbf{\hat\mu}^T$ or $\mathbf{x}_i\mathbf{x}_i^T$ in terms of the covariance matrix?
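Not part of a proof, but the distributional fact above is easy to sanity-check by simulation. A minimal sketch in Python with NumPy; the dimensions, `mu`, `Sigma`, and all variable names are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, reps = 3, 10, 200_000

mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.3, 0.1],
                  [0.3, 1.0, 0.2],
                  [0.1, 0.2, 0.5]])

# Draw `reps` independent samples of size n from N_p(mu, Sigma)
# and form the sample mean of each one.
samples = rng.multivariate_normal(mu, Sigma, size=(reps, n))  # (reps, n, p)
mu_hat = samples.mean(axis=1)                                 # (reps, p)

# mu_hat should be approximately N_p(mu, Sigma / n):
print(np.abs(mu_hat.mean(axis=0) - mu).max())       # should be close to 0
print(np.abs(np.cov(mu_hat.T) - Sigma / n).max())   # should be close to 0
```

With a couple hundred thousand replications both deviations come out near zero, consistent with $\mathbf{\hat\mu} \sim N_p(\mathbf{\mu}, \mathbf{\Sigma}/n)$.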


There are 2 answers below.

BEST ANSWER

Remember the identity $\operatorname{var}(X) = \mathbb{E}(X^2) - \mathbb{E}(X)^2$, whose matrix version is $\operatorname{cov}(\mathbf{x}) = \mathbb{E}[\mathbf{x}\mathbf{x}^T] - \mathbb{E}[\mathbf{x}]\mathbb{E}[\mathbf{x}]^T$.

It may help to notice that rearranging it gives $\mathbb{E}[\mathbf{x}_i\mathbf{x}_i^T] = \mathbf{\Sigma} + \mathbf{\mu}\mathbf{\mu}^T$. Also expand $\mathbb{E}[\mathbf{\hat\mu}\mathbf{\hat\mu}^T]$ the same way, using the definition of $\mathbf{\hat\mu}$ and its distribution $N_p(\mathbf{\mu}, \mathbf{\Sigma}/n)$ that you noted.
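The two second-moment identities this hint points at, $\mathbb{E}[\mathbf{x}_i\mathbf{x}_i^T] = \mathbf{\Sigma} + \mathbf{\mu}\mathbf{\mu}^T$ and $\mathbb{E}[\mathbf{\hat\mu}\mathbf{\hat\mu}^T] = \mathbf{\Sigma}/n + \mathbf{\mu}\mathbf{\mu}^T$, can be checked by a rough Monte Carlo simulation (parameters chosen arbitrarily for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
p, n, reps = 3, 10, 200_000

mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.3, 0.1],
                  [0.3, 1.0, 0.2],
                  [0.1, 0.2, 0.5]])

samples = rng.multivariate_normal(mu, Sigma, size=(reps, n))
x1 = samples[:, 0, :]            # one observation x_i from each replication
mu_hat = samples.mean(axis=1)    # sample mean of each replication

# E[x_i x_i^T] should be Sigma + mu mu^T
Exx = np.einsum('ri,rj->ij', x1, x1) / reps
# E[mu_hat mu_hat^T] should be Sigma/n + mu mu^T
Emm = np.einsum('ri,rj->ij', mu_hat, mu_hat) / reps

print(np.abs(Exx - (Sigma + np.outer(mu, mu))).max())      # close to 0
print(np.abs(Emm - (Sigma / n + np.outer(mu, mu))).max())  # close to 0
```

Plugging the two identities into the last line of the question's derivation gives $\frac{1}{n-1}\sum_{i=1}^n\left(\mathbf{\Sigma} + \mathbf{\mu}\mathbf{\mu}^T - \frac{\mathbf{\Sigma}}{n} - \mathbf{\mu}\mathbf{\mu}^T\right) = \frac{n}{n-1}\cdot\frac{n-1}{n}\mathbf{\Sigma} = \mathbf{\Sigma}$.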

SECOND ANSWER

Alternatively, write $x_i - \hat{\mu} = (x_i - \mu) + (\mu - \hat{\mu})$ and expand: \begin{align} (x_i - \hat{\mu})(x_i - \hat{\mu})^\top &= (x_i - \mu)(x_i-\mu)^\top + (x_i-\mu)(\mu - \hat{\mu})^\top + (\mu -\hat{\mu})(x_i-\mu)^\top + (\hat{\mu} - \mu)(\hat{\mu}-\mu)^\top. \end{align}

  • The expectation of the first term is $\Sigma$.
  • The expectation of the last term is $\Sigma/n$ (since $\hat{\mu}$ has mean $\mu$ and covariance $\Sigma/n$, as you noted).

The second term is $$ \begin{align} &(x_i - \mu)(\mu-\hat{\mu})^\top \\ &= \frac{1}{n} \sum_{j=1}^n (x_i-\mu)(\mu-x_j)^\top \\ &= -\frac{1}{n} (x_i - \mu)(x_i-\mu)^\top - \frac{1}{n} \sum_{j\ne i} (x_i-\mu)(x_j-\mu)^\top. \end{align}$$ which has expectation $-\frac{1}{n} \Sigma$ because $x_i$ and $x_j$ are independent when $i \ne j$.

The third term can be handled similarly.
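A quick simulation of the cross term (the third term is just its transpose, so it has the same expectation $-\frac{1}{n}\Sigma$); the parameters and variable names below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)
p, n, reps = 3, 10, 200_000

mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.3, 0.1],
                  [0.3, 1.0, 0.2],
                  [0.1, 0.2, 0.5]])

samples = rng.multivariate_normal(mu, Sigma, size=(reps, n))
mu_hat = samples.mean(axis=1)
xi = samples[:, 0, :]   # fix i = 1 in each replication

# E[(x_i - mu)(mu - mu_hat)^T] should be -Sigma / n
cross = np.einsum('ri,rj->ij', xi - mu, mu - mu_hat) / reps
print(np.abs(cross - (-Sigma / n)).max())   # should be close to 0
```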

Combining everything together, we have $$E[(x_i - \hat{\mu})(x_i - \hat{\mu})^\top] = \Sigma - \frac{2}{n} \Sigma + \frac{1}{n} \Sigma = \frac{n-1}{n} \Sigma.$$

Summing over $i$ and dividing by $n-1$ yields $\mathbb{E}[\mathbf{S}] = \Sigma$, so $\mathbf{S}$ is unbiased.
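The conclusion $\mathbb{E}[\mathbf{S}] = \mathbf{\Sigma}$ can also be checked end to end by averaging $\mathbf{S}$ over many simulated datasets. A sketch in Python/NumPy, with arbitrary illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(3)
p, n, reps = 3, 10, 100_000

mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.3, 0.1],
                  [0.3, 1.0, 0.2],
                  [0.1, 0.2, 0.5]])

samples = rng.multivariate_normal(mu, Sigma, size=(reps, n))
centred = samples - samples.mean(axis=1, keepdims=True)   # mean-centre each X

# Average of S = X^T X / (n - 1) over all replications
S_mean = np.einsum('rni,rnj->ij', centred, centred) / (n - 1) / reps

print(np.abs(S_mean - Sigma).max())   # close to 0: S is unbiased
```

Dividing by $n$ instead of $n-1$ would shrink every entry by the factor $\frac{n-1}{n}$ derived above, which is exactly the bias of the MLE of $\mathbf{\Sigma}$.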