Differentiating the log-likelihood of a multivariate normal distribution with respect to $\mu$


I have samples from multiple groups, $k \in \{1,2,\dots,K\}$. I need to find the maximum likelihood estimates of $\mu^{(k)}$ and of $V$. From the linked derivation I can see how the entire right-hand part becomes $Nd$, where $N=n_1+\dots+n_K$. However, that does not help me find an explicit expression for the MLE of $\mu^{(k)}$ or of $V$.


Best answer

Let me restate the problem to check that I have understood it correctly.

You have $K$ samples such that $\mathbf{X}^{(i)} \sim \mathcal{N}_d (\mu^{(i)},V)$ for $i=1,\dots,K$ (in particular, your data are not identically distributed), and you want to estimate each mean and the common covariance matrix.

Since the samples are independent, the log-likelihood function is

$$ \begin{split} \ell(\mu^{(i)},V) &= \log \prod_{i=1}^{K} f_{\bf{X^{(i)}}} = \sum_{i=1}^{K}\bigg(-\frac{d}{2}\log(2\pi)- \frac{1}{2}\log|V|-\frac{1}{2}(x^i-\mu^i)^T V^{-1}(x^i-\mu^i)\bigg) \\ &= -\frac{Kd}{2}\log(2\pi)-\frac{K}{2}\log|V|-\frac{1}{2}\sum_{i=1}^{K}(x^i-\mu^i)^T V^{-1}(x^i-\mu^i) \end{split} $$
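
The expression above is written for a single observation per group. As a sketch of the case described in the question, assume (this notation is introduced here and is not in the original answer) that group $i$ contributes $n_i$ independent observations $x^i_1,\dots,x^i_{n_i}$, with $N=n_1+\dots+n_K$. The same steps would then give

$$ \ell = -\frac{Nd}{2}\log(2\pi)-\frac{N}{2}\log|V|-\frac{1}{2}\sum_{i=1}^{K}\sum_{j=1}^{n_i}(x^i_j-\mu^i)^T V^{-1}(x^i_j-\mu^i) $$

which reduces to the formula above when every $n_i=1$, so that $N=K$.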

To compute the MLE for the means and covariance matrix we will differentiate and equate the derivatives to $0$.

Means

Recall that for a symmetric matrix $A$ that does not depend on $z$, $\frac{\partial(\mathbf{z^TAz})}{\partial \mathbf{z}}=2\mathbf{Az}$. Hence

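$$ \frac{\partial \ell}{\partial \mu^i} = V^{-1}(x^i-\mu^i) = 0 \Rightarrow \hat{\mu}^i = x^i $$

Only the $i$-th term of the sum depends on $\mu^i$, and the factor $-\frac{1}{2}$ cancels against the $2$ from the quadratic form and the $-1$ from the chain rule. If group $i$ contains $n_i$ observations $x^i_1,\dots,x^i_{n_i}$ rather than one, that term becomes a sum over them and the same step gives the group sample mean:

$$ \frac{\partial \ell}{\partial \mu^i} = \sum_{j=1}^{n_i}V^{-1}(x^i_j-\mu^i) = 0 \Rightarrow \hat{\mu}^i = \frac{1}{n_i}\sum_{j=1}^{n_i} x^i_j $$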

Covariance matrix

We will need several results from linear algebra:

  • $\text{Tr}(ABC)=\text{Tr}(CAB)=\text{Tr}(BCA)$
  • $z^TAz = \text{Tr}(z^TAz)=\text{Tr}(zz^TA)$
  • $\frac{\partial}{\partial A}\text{Tr}(AB)=B^T$
  • $\frac{\partial}{\partial A}\log|A|=(A^{-1})^T=(A^T)^{-1}$
  • $|A|=1/|A^{-1}|$
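
The identities above can be checked numerically. The following is a minimal sketch using NumPy; the helper `num_grad`, the random test matrices, and the tolerances are illustrative choices rather than part of the original answer. The derivatives are approximated by central finite differences, treating all entries of the matrix as independent variables.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3

# Random test matrices, plus a symmetric positive-definite matrix S.
A, B, C = (rng.normal(size=(d, d)) for _ in range(3))
M = rng.normal(size=(d, d))
S = M @ M.T + d * np.eye(d)
z = rng.normal(size=d)


def num_grad(f, X, eps=1e-6):
    """Central finite-difference gradient of a scalar function f at matrix X."""
    G = np.zeros_like(X)
    for r in range(X.shape[0]):
        for c in range(X.shape[1]):
            E = np.zeros_like(X)
            E[r, c] = eps
            G[r, c] = (f(X + E) - f(X - E)) / (2 * eps)
    return G


# Cyclic property of the trace: Tr(ABC) = Tr(CAB) = Tr(BCA).
t = np.trace(A @ B @ C)
cyclic_ok = bool(np.isclose(t, np.trace(C @ A @ B)) and np.isclose(t, np.trace(B @ C @ A)))

# Quadratic form as a trace: z^T A z = Tr(z z^T A).
quad_ok = bool(np.isclose(z @ A @ z, np.trace(np.outer(z, z) @ A)))

# d/dA Tr(AB) = B^T.
trace_grad_ok = bool(np.allclose(num_grad(lambda X: np.trace(X @ B), A), B.T, atol=1e-5))

# d/dA log|A| = (A^{-1})^T, evaluated at the positive-definite matrix S.
logdet = lambda X: np.linalg.slogdet(X)[1]
logdet_grad_ok = bool(np.allclose(num_grad(logdet, S), np.linalg.inv(S).T, atol=1e-5))

# |A| = 1 / |A^{-1}|.
det_ok = bool(np.isclose(np.linalg.det(S), 1.0 / np.linalg.det(np.linalg.inv(S))))

print(cyclic_ok, quad_ok, trace_grad_ok, logdet_grad_ok, det_ok)
```

Each flag records whether the corresponding identity holds to numerical tolerance for these particular matrices; this is a sanity check, not a proof.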

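Using the trace identities, the log-likelihood function can be rewritten as

$$ \ell =-\frac{Kd}{2}\log(2\pi)+\frac{K}{2}\log|V^{-1}|-\frac{1}{2}\sum_{i=1}^{K}\text{Tr}\big((x^i-\mu^i)(x^i-\mu^i)^TV^{-1}\big) $$

Since $V$ and each $(x^i-\mu^i)(x^i-\mu^i)^T$ are symmetric, differentiating with respect to $V^{-1}$ gives

$$ \frac{\partial \ell}{\partial V^{-1}} = \frac{K}{2}V -\frac{1}{2}\sum_{i=1}^{K}(x^i-\mu^i)(x^i-\mu^i)^T = 0 \Longrightarrow \hat{V} = \frac{1}{K}\sum_{i=1}^{K}(x^i-\hat{\mu}^i)(x^i-\hat{\mu}^i)^T $$

Note that with one observation per group the separate means fit the data exactly ($\hat{\mu}^i = x^i$), so this estimate is zero. It is only informative when group $i$ contains $n_i > 1$ observations $x^i_1,\dots,x^i_{n_i}$. The sums then run over all $N=n_1+\dots+n_K$ observations, $K$ is replaced by $N$, and

$$ \hat{V} = \frac{1}{N}\sum_{i=1}^{K}\sum_{j=1}^{n_i}(x^i_j-\hat{\mu}^i)(x^i_j-\hat{\mu}^i)^T, \qquad \hat{\mu}^i = \frac{1}{n_i}\sum_{j=1}^{n_i}x^i_j $$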
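
As a numerical illustration of the result, here is a minimal NumPy sketch for grouped data with a shared covariance matrix. The group sizes, true parameters, and the function name `log_lik` are hypothetical values chosen for the example. It assumes several observations per group, computes the group sample means and the pooled covariance with divisor $N$, and checks two things: that the quadratic term of the log-likelihood equals $Nd$ at these estimates, and that randomly perturbed parameters do not give a higher log-likelihood.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 2
sizes = [40, 25, 60]  # hypothetical group sizes n_1, ..., n_K
true_means = [np.array([0.0, 0.0]), np.array([3.0, -1.0]), np.array([-2.0, 4.0])]
true_V = np.array([[2.0, 0.6], [0.6, 1.0]])

# One (n_k, d) array per group, all drawn with the same covariance matrix.
groups = [rng.multivariate_normal(m, true_V, size=n) for m, n in zip(true_means, sizes)]
N = sum(sizes)


def quad_term(means, V):
    """Sum over all observations of (x - mu)^T V^{-1} (x - mu)."""
    V_inv = np.linalg.inv(V)
    return sum(np.einsum("nd,de,ne->", X - m, V_inv, X - m) for X, m in zip(groups, means))


def log_lik(means, V):
    """Gaussian log-likelihood with one mean per group and a shared covariance V."""
    _, logdet = np.linalg.slogdet(V)
    return -0.5 * N * d * np.log(2 * np.pi) - 0.5 * N * logdet - 0.5 * quad_term(means, V)


# Closed-form estimates: group sample means and pooled covariance with divisor N.
mu_hat = [X.mean(axis=0) for X in groups]
V_hat = sum((X - m).T @ (X - m) for X, m in zip(groups, mu_hat)) / N

# At the estimates the quadratic term is Tr(V_hat^{-1} N V_hat) = N d.
quad_at_mle = quad_term(mu_hat, V_hat)

# Random perturbations of the estimates should not increase the log-likelihood.
best = log_lik(mu_hat, V_hat)
never_better = True
for _ in range(200):
    means_p = [m + 0.1 * rng.normal(size=d) for m in mu_hat]
    T = np.eye(d) + 0.1 * rng.normal(size=(d, d))
    V_p = T @ V_hat @ T.T  # stays symmetric positive definite when T is invertible
    never_better = never_better and bool(log_lik(means_p, V_p) <= best)

print(np.isclose(quad_at_mle, N * d), never_better)
```

The first check corresponds to the observation in the question that the quadratic part of the log-likelihood becomes $Nd$ once the estimates are plugged in.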