How do I show that the second derivative is negative at this turning point, so that the turning point is indeed the Maximum Likelihood Estimator?


This is a question from Statistics but it really boils down to some algebra that I cannot seem to get my head around.

Consider $(X_i)_{i=1,2,...,n}$ iid such that each is $N(u, u^2)$. Then the log-likelihood is given as (I have thrown away some constants):

$l(u)=-\frac{n}{2}\log(u^2)-\frac{1}{2}u^{-2}\sum_{i=1}^{n}(X_i^2-2uX_i)$.

$l'(u)=-nu^{-1}+u^{-3}\sum(X_i^2)-u^{-2}\sum X_i$.

Setting the above derivative to $0$ and multiplying through by $-u^3$ gives the quadratic $nu^2+u\sum X_i-\sum X_i^2=0$, which has two solutions:

$\hat u=\frac{1}{2}\left(-\bar x\pm\sqrt{\bar x^2+\frac{4\sum X_i^2}{n}}\right)$.

Now I was wondering how to decide which one is the MLE. I could compute $l''(\hat u)$ for each root and check whether it is negative, but the algebra becomes quite messy. Here is my attempt though:

$l''(u)=nu^{-2}-3u^{-4}\sum(X_i^2)+2u^{-3}(\sum X_i)\Longrightarrow u^4l''(u)=nu^2-3\sum(X_i^2)+2u(\sum X_i)$.

Evaluated at a critical point $\hat u$, where $n\hat u^2=\sum X_i^2-\hat u\sum X_i$, the right-hand side becomes $\hat u\sum X_i-2\sum X_i^2$, but how would we determine what sign this has?

Many thanks in advance!


There are 2 best solutions below

On BEST ANSWER

It is not true that we must have $u > 0$. For example, if $u = -1$, the distribution $$\operatorname{Normal}(\mu = -1, \sigma^2 = (-1)^2 = 1)$$ is a perfectly valid one, and we can generate realizations from such a distribution and the resulting MLE calculation remains valid up to the point of selecting the correct critical point.

At this stage, you have identified two critical points, corresponding to the positive and negative branches of the square root, and both are local maxima. Both pass the second derivative test, and there is no local minimum in between them because the log-likelihood tends to $-\infty$ as $u \to 0$.
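For readers who want to see the second derivative test carried out, here is a short sketch that uses only the formulas from the question. At any critical point $\hat u$ the quadratic gives $\hat u\sum X_i = \sum X_i^2 - n\hat u^2$, so

$$\hat u^4\, l''(\hat u) = n\hat u^2 - 3\sum X_i^2 + 2\hat u\sum X_i = \hat u\sum X_i - 2\sum X_i^2 = -n\hat u^2 - \sum X_i^2 < 0.$$

The inequality is strict as long as the observations are not all zero, which holds with probability $1$. Hence $l''(\hat u) < 0$ at both roots.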

So which one is the MLE? The choice is obvious: you pick the root whose sign matches the sign of the sample mean $\bar x$. Think about this. When a sample of sufficiently large size is drawn from a normal distribution with mean $u$, the sample mean tends to be "close" to the true mean $u$. Therefore, when $\bar x > 0$, you choose the positive root. When $\bar x < 0$, you choose the negative root.

What happens if you observe $\bar x = 0$? Then both critical points are MLEs. Recall that the MLE is not necessarily uniquely defined. In such a case, either choice results in a maximal likelihood. The intuitive explanation is that there is no information about the sign of $u$ contained in the sample; therefore, a positive estimate is just as valid as a negative one.
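The sign rule and the tie at $\bar x = 0$ can also be checked algebraically. The following is a sketch; the labels $\hat u_+$ and $\hat u_-$ for the positive and negative roots are introduced here for convenience. Substituting $\sum X_i^2 = n\hat u^2 + \hat u\sum X_i$ into the log-likelihood from the question gives, at either critical point,

$$l(\hat u) = -\frac{n}{2}\log(\hat u^2) - \frac{n}{2} + \frac{n\bar x}{2\hat u}.$$

The two roots satisfy $\hat u_+ + \hat u_- = -\bar x$. If $\bar x > 0$, then $|\hat u_+| < |\hat u_-|$, so the logarithmic term is larger at $\hat u_+$, and $\bar x/\hat u_+ > 0 > \bar x/\hat u_-$, so the last term is larger at $\hat u_+$ as well. Hence $l(\hat u_+) > l(\hat u_-)$. The case $\bar x < 0$ is symmetric, and when $\bar x = 0$ we have $\hat u_- = -\hat u_+$, so the two values coincide.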


Since more than one comment has made an incorrect assumption about $u$, I would like to invite the reader to perform the following exercise.

Suppose our sample is $$\boldsymbol x = \{-1, -1, 0, 2, 3, x_6\}.$$ Thus the log-likelihood and the two critical points can be computed as a univariate function of the last observation $x_6$. Plot the log-likelihood for the three cases $x_6 = -9, -3$, and $3$, respectively and note the location of the critical points. What do you see? This illustrates why the restriction $u > 0$ is completely unnecessary and improperly eliminates an entire subset of distributions from consideration.
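For readers who prefer to check the exercise numerically before plotting, here is a minimal sketch using only the Python standard library. The function names `log_likelihood` and `critical_points` are hypothetical helpers introduced for illustration; the formulas are the ones from the question, with the same additive constant dropped.

```python
import math

def log_likelihood(u, xs):
    """Log-likelihood of N(u, u^2), up to the additive constant dropped in the question."""
    n = len(xs)
    s1 = sum(xs)                    # sum of X_i
    s2 = sum(x * x for x in xs)     # sum of X_i^2
    return -0.5 * n * math.log(u * u) - 0.5 * (s2 - 2.0 * u * s1) / (u * u)

def critical_points(xs):
    """Roots of n*u^2 + u*sum(X_i) - sum(X_i^2) = 0, returned as (negative, positive)."""
    n = len(xs)
    xbar = sum(xs) / n
    m2 = sum(x * x for x in xs) / n
    root = math.sqrt(xbar * xbar + 4.0 * m2)
    return 0.5 * (-xbar - root), 0.5 * (-xbar + root)

base = [-1, -1, 0, 2, 3]
results = {}
for x6 in (-9, -3, 3):
    xs = base + [x6]
    u_neg, u_pos = critical_points(xs)
    l_neg = log_likelihood(u_neg, xs)
    l_pos = log_likelihood(u_pos, xs)
    results[x6] = (u_neg, l_neg, u_pos, l_pos)
    xbar = sum(xs) / len(xs)
    print(f"x6={x6:>2}  mean={xbar:+.3f}  "
          f"u-={u_neg:+.4f} (l={l_neg:.4f})  u+={u_pos:+.4f} (l={l_pos:.4f})")
```

The three values of $x_6$ give sample means of $-1$, $0$, and $1$, so the printed log-likelihoods should favor the negative root, tie, and favor the positive root, respectively. To produce the plot, evaluate `log_likelihood` on a grid of $u$ values on both sides of $0$ (excluding $0$ itself) with any plotting tool.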

On

Observe the term $\log u$, suggesting that $u\gt 0$. You must take the positive sign.