MLE of Normal Variance with Changing Means


I have $n$ draws from a 2d normal distribution with a different mean vector for every draw. The variance stays constant at $\tau I_2$. How can I get the MLE estimator for $\tau$? Since the mean changes with every draw, it seems $0$ or some very small $\epsilon$ would be the best estimate, but that doesn't make much sense in a normal setup.

Best answer:

The PDF of a $k$-dimensional multivariate normal distribution with mean vector $\mu$ and covariance matrix $\tau I$ is
$$f(x \mid \mu, \tau) = (2 \pi)^{-k/2} \tau^{-k/2} \exp\left(-\frac{1}{2\tau}(x-\mu)^T(x-\mu)\right)$$
Given $N$ observations of the distribution, the likelihood function can be written as
$$L(\mu,\tau) = f(x_1,x_2,\ldots, x_N \mid \mu,\tau)$$
If the observations are independent, this factors into the product
$$L(\mu,\tau) = \prod_{n=1}^N f(x_n \mid \mu,\tau)$$
which is
$$L(\mu,\tau) = (2 \pi)^{-Nk/2} \tau^{-Nk/2} \exp\left(-\frac{1}{2\tau}\sum\limits_{n=1}^N (x_n-\mu)^T(x_n-\mu)\right)$$
The maximum likelihood estimate of $\tau$ maximizes the above, or any increasing function of the above, one of which is the log-likelihood:
$$\ell (\mu,\tau) = \log L(\mu,\tau) = \log \left[(2 \pi)^{-Nk/2} \tau^{-Nk/2} \exp\left(-\frac{1}{2\tau}\sum\limits_{n=1}^N (x_n-\mu)^T(x_n-\mu)\right)\right]$$
Splitting the logarithm of the product into a sum,
$$\ell (\mu,\tau) = \log (2 \pi)^{-Nk/2} + \log \tau^{-Nk/2} + \log \exp\left(-\frac{1}{2\tau}\sum\limits_{n=1}^N (x_n-\mu)^T(x_n-\mu)\right)$$
Using $\log \exp(x) = x$ and $\log x^a = a \log x$, we get
$$\ell (\mu,\tau) = -\frac{Nk}{2}\log (2 \pi) - \frac{Nk}{2} \log \tau -\frac{1}{2\tau}\sum\limits_{n=1}^N (x_n-\mu)^T(x_n-\mu)$$
Differentiating with respect to $\tau$ and setting the derivative to zero,
$$\frac{\partial}{\partial \tau}\ell (\mu,\tau) = - \frac{Nk}{2\tau} + \frac{1}{2\tau^2}\sum\limits_{n=1}^N (x_n-\mu)^T(x_n-\mu) = 0$$
which gives you
$$\hat{\tau} = \frac{1}{Nk}\sum\limits_{n=1}^N (x_n-\mu)^T(x_n-\mu) \tag{1}$$
But we do not know the mean $\mu$, and hence it should be estimated by the same process, i.e.
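As a quick sanity check on estimator $(1)$, here is a small NumPy simulation (the sample size, dimension, and mean vector below are arbitrary choices for illustration): draw from a normal with a *known* mean and verify that $\hat{\tau}$ lands close to the true $\tau$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative check of estimator (1): with the mean mu known,
# hat_tau averages the squared deviations over all N*k coordinates.
N, k, tau = 10_000, 2, 0.5          # sample size, dimension, true variance
mu = np.array([1.0, -2.0])          # known mean vector (arbitrary choice)
x = rng.normal(loc=mu, scale=np.sqrt(tau), size=(N, k))  # draws from N(mu, tau*I)

hat_tau = np.sum((x - mu) ** 2) / (N * k)   # estimator (1)
print(hat_tau)                              # should be close to tau = 0.5
```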
$$\frac{\partial}{\partial \mu}\ell (\mu,\tau) = - \frac{1}{2\tau}\sum\limits_{n=1}^N (-2 x_n + 2 \mu ) = \frac{1}{\tau}\sum\limits_{n=1}^N (x_n - \mu) = 0 $$ which gives us $$\hat{\mu} = \frac{1}{N}\sum\limits_{n=1}^N x_n \tag{2} $$ Substituting (2) into (1), we get $$\hat{\tau} = \frac{1}{Nk}\sum\limits_{n=1}^N \left(x_n-\frac{1}{N}\sum\limits_{l=1}^N x_l\right)^T\left(x_n-\frac{1}{N}\sum\limits_{l=1}^N x_l\right)$$
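The final plug-in estimator can be sketched in a few lines of NumPy (note this assumes, as the derivation above does, a single common mean across all draws; the true `mu` and `tau` below are arbitrary values used only to simulate data):

```python
import numpy as np

rng = np.random.default_rng(1)

# Plug-in estimator: estimate mu by the sample mean (2),
# then substitute it into (1) to get hat_tau.
N, k, tau = 10_000, 2, 2.0
mu = np.array([3.0, 0.5])           # unknown in practice; used only to simulate
x = rng.normal(loc=mu, scale=np.sqrt(tau), size=(N, k))

hat_mu = x.mean(axis=0)                          # estimator (2)
hat_tau = np.sum((x - hat_mu) ** 2) / (N * k)    # estimator (1) with hat_mu
print(hat_mu, hat_tau)                           # near [3.0, 0.5] and 2.0
```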