Given an i.i.d. sample $X_{1}, \dots, X_{n}$ from a $p$-dimensional multivariate normal distribution, I want to derive the maximized likelihood:
$$L(\hat{\mu}, \hat{\Sigma} ; X_{1}, \dots , X_{n}) = (2\pi)^{-np/2}\left | \hat{\Sigma} \right |^{-n/2}e^{-np/2} $$
Start with the first step of the MLE derivation: the likelihood is the product of $n$ $p$-dimensional normal PDFs (no hats yet, since we have not maximized):
$$ L(\mu, \Sigma ; X_{1}, \dots , X_{n}) = \prod_{i=1}^{n}(2\pi)^{-p/2}\left | \Sigma \right |^{-1/2}e^{-\frac{1}{2}(\vec{x}_{i}-\vec{\mu})^{T}\Sigma^{-1}(\vec{x}_{i}-\vec{\mu})} $$
Simplify the product:
$$ L(\mu, \Sigma ; X_{1}, \dots , X_{n}) = (2\pi)^{-np/2}\left | \Sigma \right |^{-n/2}e^{-\frac{1}{2} \sum_{i=1}^{n}(\vec{x}_{i}-\vec{\mu})^{T}\Sigma^{-1}(\vec{x}_{i}-\vec{\mu})} $$
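As a quick numerical sanity check of this simplification, here is a sketch in NumPy with arbitrary simulated data and parameters (working on the log scale to avoid underflow):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 8, 3
X = rng.standard_normal((n, p))          # rows are the observations x_i
mu = rng.standard_normal(p)
A = rng.standard_normal((p, p))
Sigma = A @ A.T + p * np.eye(p)          # an arbitrary positive definite covariance

Sinv = np.linalg.inv(Sigma)
logdet = np.log(np.linalg.det(Sigma))

# Log of the product of the n individual multivariate normal densities
log_product = sum(
    -p / 2 * np.log(2 * np.pi) - logdet / 2
    - 0.5 * (x - mu) @ Sinv @ (x - mu)
    for x in X
)

# Log of the simplified form: (2*pi)^{-np/2} |Sigma|^{-n/2} exp(-1/2 * sum of quadratic forms)
quad_sum = sum((x - mu) @ Sinv @ (x - mu) for x in X)
log_simplified = -n * p / 2 * np.log(2 * np.pi) - n / 2 * logdet - 0.5 * quad_sum

assert np.isclose(log_product, log_simplified)
```

The two expressions agree up to floating-point error, confirming the algebra of collapsing the product into a single exponential.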
And then I cannot see how the sum in the exponent reduces to the $np$ that appears in the "solution". Can someone give me a hint?
Solution
The trick is a decomposition of the sum in the exponent using the following identity:
For a $p \times 1$ vector $\vec{a}$ and a $p \times p$ matrix $A$, the quadratic form $\vec{a}^{T}A\vec{a}$ is a scalar, so it equals its own trace, and the trace is invariant under cyclic permutation: $$\vec{a}^{T}A\vec{a} = trace(\vec{a}^{T}A\vec{a}) = trace(A\vec{a}\vec{a}^{T})$$ Hence $$\sum_{i=1}^{n}(\vec{x}_{i}-\vec{\mu})^{T}\Sigma^{-1}(\vec{x}_{i}-\vec{\mu}) = trace\left(\Sigma^{-1}\sum_{i=1}^{n}(\vec{x}_{i}-\vec{\mu})(\vec{x}_{i}-\vec{\mu})^{T}\right)$$
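The trace identity above can be checked numerically; the sketch below uses a symmetric positive definite matrix standing in for $\Sigma^{-1}$ and arbitrary simulated data:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 10, 4
X = rng.standard_normal((n, p))          # rows are x_i
mu = X.mean(axis=0)
B = rng.standard_normal((p, p))
Sinv = B @ B.T + np.eye(p)               # stands in for Sigma^{-1}: symmetric p.d.

D = X - mu                               # row i is (x_i - mu)^T

# Left side: the sum of n scalar quadratic forms
lhs = sum(d @ Sinv @ d for d in D)

# Right side: trace of Sinv times the summed outer products
S = D.T @ D                              # equals sum_i (x_i - mu)(x_i - mu)^T
rhs = np.trace(Sinv @ S)

assert np.isclose(lhs, rhs)
```

Note that `D.T @ D` is a compact way to accumulate the $n$ outer products $(\vec{x}_{i}-\vec{\mu})(\vec{x}_{i}-\vec{\mu})^{T}$ in one matrix multiplication.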
Now evaluate the likelihood at its maximizer: the MLE of the covariance matrix (obtained by setting the derivative of the log-likelihood to zero) is $$\hat{\Sigma} = \frac{1}{n}\sum_{i=1}^{n}(\vec{x}_{i}-\hat{\mu})(\vec{x}_{i}-\hat{\mu})^{T},$$ where $\hat{\mu} = \bar{x}$ is the sample mean.
Therefore
$$ trace\left(\hat{\Sigma}^{-1}\sum_{i=1}^{n}(\vec{x}_{i}-\hat{\mu})(\vec{x}_{i}-\hat{\mu})^{T}\right) = trace(\hat{\Sigma}^{-1} \cdot n\hat{\Sigma}) = n\,trace(I_{p}) = np$$
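Putting the pieces together, a short NumPy sketch (arbitrary simulated data) confirms both that the exponent collapses to $np$ at the MLE and that the closed-form maximized likelihood matches a direct evaluation, compared on the log scale:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 12, 3
X = rng.standard_normal((n, p))

mu_hat = X.mean(axis=0)                  # MLE of mu
D = X - mu_hat
Sigma_hat = (D.T @ D) / n                # MLE of Sigma
Sinv = np.linalg.inv(Sigma_hat)

# The exponent collapses to np once the MLE is plugged in
exponent = np.trace(Sinv @ (D.T @ D))
assert np.isclose(exponent, n * p)

# Log of the closed-form maximized likelihood ...
logdet = np.log(np.linalg.det(Sigma_hat))
ll_closed = -n * p / 2 * np.log(2 * np.pi) - n / 2 * logdet - n * p / 2

# ... matches the log-likelihood summed over the individual densities
ll_direct = sum(
    -p / 2 * np.log(2 * np.pi) - logdet / 2 - 0.5 * d @ Sinv @ d
    for d in D
)
assert np.isclose(ll_closed, ll_direct)
```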