Why does setting the derivative of a likelihood function equal to 0 maximize the likelihood function?


I'm learning from a statistics tutorial which defines a likelihood function as

\begin{align} L(1,3,2,2; \theta)=27 \cdot \theta^{8} (1-\theta)^{4} \tag{1} \end{align}

and then the tutorial sets the derivative of (1) to zero to find the value of $\theta$ that maximizes the likelihood function.

I understand where the following formula comes from.

\begin{align} \frac{\text d L(1,3,2,2; \theta)}{\text d\theta}= 27 \big[8\theta^{7} (1-\theta)^{4}-4\theta^{8} (1-\theta)^{3} \big] \tag{2} \end{align}

I don't understand how to determine whether setting (2) to 0 produces a maximum or a minimum.

Per another tutorial, we could use the second derivative of the function to determine whether a critical point is a maximum or a minimum.

Here is the second derivative of the likelihood function (1), keeping the constant factor 27:

$27 \cdot 4\left(\theta-1\right)^2\theta^6\left(33\theta^2-44\theta+14\right) \tag{3}$

Setting (2) to zero and simplifying it gives

$2-3\theta = 0 \tag{4}$
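For reference, the simplification works by factoring (2), setting aside the boundary roots $\theta = 0$ and $\theta = 1$ (where the likelihood is zero):

\begin{align} \frac{\text d L(1,3,2,2; \theta)}{\text d\theta}= 27 \theta^{7} (1-\theta)^{3}\big[8(1-\theta)-4\theta\big] = 27 \theta^{7} (1-\theta)^{3}(8-12\theta) \end{align}

For $0 < \theta < 1$ this is zero exactly when $8-12\theta=0$; dividing by 4 gives (4), so $\theta = 2/3$.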

How do I use (3) to determine if it is a maximum or minimum?

Should I set (3) to zero and simplify it the same way to get (4)?

Any other method is also welcome.


There are 2 answers below.

BEST ANSWER

Xi'an is obviously right, but maybe a less formal description can help develop your intuition.

You have one parameter, so you could create a graph with the parameter on the x-axis and the likelihood on the y-axis. The first derivative is the slope of that curve at that parameter value. If the derivative is positive, then the curve is upward sloping so there is a higher likelihood somewhere to the right and the current parameter value cannot be the maximum. Similarly, if the derivative is negative, then the curve is downward sloping so there is a higher likelihood somewhere to the left and the current parameter value cannot be the maximum. Only when the derivative is zero can that parameter value result in a maximum likelihood value. So that is a necessary condition.

However, that value could also be a minimum. To distinguish between those we look at the second derivative, which tells us how the slope changes. For a maximum we start with a positive slope, which decreases as we move to the right. So the second derivative is negative at a maximum. For a minimum we start with a negative slope, which becomes less negative (i.e. increases) as we move to the right. So the second derivative is positive at a minimum.
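This sign check can also be carried out symbolically. Here is a minimal sketch using SymPy (an assumed dependency, not part of the original tutorial) on the likelihood from (1):

```python
import sympy as sp

theta = sp.symbols("theta")

# Likelihood from (1): L = 27 * theta^8 * (1 - theta)^4
L = 27 * theta**8 * (1 - theta) ** 4

dL = sp.diff(L, theta)                    # first derivative, eq. (2)
critical = sp.solve(sp.Eq(dL, 0), theta)  # roots: 0, 2/3, 1
d2L = sp.diff(L, theta, 2)                # second derivative

# Second-derivative test at the interior critical point theta = 2/3:
value = d2L.subs(theta, sp.Rational(2, 3))
print(value)  # -512/729: negative, so theta = 2/3 is a local maximum
```

The boundary roots $\theta = 0$ and $\theta = 1$ give likelihood zero, so only the interior critical point matters.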


To quote from Wikipedia:

One way to state Fermat's theorem is that, if a function has a local extremum at some point and is differentiable there, then the function's derivative at that point must be zero. In precise mathematical language:

Let $f\colon (a,b) \rightarrow \mathbb{R}$ be a function and suppose that $x_0 \in (a,b)$ is a point where $f$ has a local extremum. If $f$ is differentiable at $x_0$, then $$f'(x_0) = 0.$$

and

After establishing the critical points of a function, the second-derivative test uses the value of the second derivative at those points to determine whether such points are a local maximum or a local minimum. If the function $f$ is twice-differentiable at a critical point $x$ (i.e. a point where $f'(x) = 0$), then:

  • If $f''(x) < 0$, then $f$ has a local maximum at $x$.
  • If $f''(x) > 0$, then $f$ has a local minimum at $x$.
  • If $f''(x) = 0$, the test is inconclusive.
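Applied to the likelihood in (1), the test amounts to evaluating the second derivative at the critical point $\theta = 2/3$ from (4). A numeric sketch (the formulas mirror (1) and (2); the second derivative is written with its constant factor, which never changes the sign):

```python
def L(theta):
    # Likelihood from (1)
    return 27 * theta**8 * (1 - theta) ** 4

def dL(theta):
    # First derivative, eq. (2)
    return 27 * (8 * theta**7 * (1 - theta) ** 4 - 4 * theta**8 * (1 - theta) ** 3)

def d2L(theta):
    # Second derivative of (1); a positive constant factor does not affect the sign test
    return 27 * 4 * (theta - 1) ** 2 * theta**6 * (33 * theta**2 - 44 * theta + 14)

theta_hat = 2 / 3                     # root of (4)
print(abs(dL(theta_hat)) < 1e-9)      # True: theta_hat is a critical point
print(d2L(theta_hat) < 0)             # True: L'' < 0 there, so it is a local maximum
print(L(theta_hat) > L(0.5))          # True: higher likelihood than a nearby value
```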