Lagrange multiplier: How to solve system of equations


In the answer to this question: Maximum likelihood estimator of categorical distribution, we are looking for $\theta = (\theta_1, \theta_2, \theta_3)$ satisfying these conditions:

$$ \theta_1+\theta_2+\theta_3 = 1,\tag 0 $$

$$ (1,1,1) = \lambda \left( \frac{x_1}{\theta_1}, \frac{x_2}{\theta_2}, \frac{x_3}{\theta_3} \right) \tag 1 $$

While I understand why the stated result is correct, how do I arrive at it analytically?


The likelihood function is $\theta_1^{x_1} \theta_2^{x_2} \theta_3^{x_3}$ (up to a multiplicative constant that doesn't depend on $\theta$). The log likelihood is $x_1 \log(\theta_1) + x_2 \log(\theta_2) + x_3 \log(\theta_3)$ (up to an additive constant that doesn't depend on $\theta$).
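As a quick sketch (with hypothetical counts $x = (3, 5, 2)$, not from the original question), the log-likelihood above can be computed directly:

```python
import math

def log_likelihood(theta, x):
    """Log-likelihood x_1*log(theta_1) + x_2*log(theta_2) + x_3*log(theta_3),
    dropping the additive constant that does not depend on theta."""
    return sum(xi * math.log(ti) for xi, ti in zip(x, theta))

# Hypothetical counts; any probabilities summing to 1 can be compared.
x = (3, 5, 2)
print(log_likelihood((1/3, 1/3, 1/3), x))
print(log_likelihood((0.3, 0.5, 0.2), x))  # the candidate MLE scores higher
```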

There is a constraint that $\theta_1+\theta_2+\theta_3=1$ from the properties of the multinomial distribution. You could handle it by substituting one of the $\theta$'s in terms of the other two, but you can instead use Lagrange multipliers, which gives you the three equations

$$\frac{\partial \log(L)}{\partial \theta_i} = \frac{x_i}{\theta_i} = \lambda \frac{\partial(\theta_1+\theta_2+\theta_3)}{\partial \theta_i} = \lambda$$

This is the Lagrange condition in the form I usually see it in multivariable calculus, $\nabla f = \lambda \nabla g$, where $f$ is the objective function and $g$ is the constraint function. The equation they wrote is just this one with $\lambda$ moved to the other side; in other words, their $\lambda$ is my $\lambda^{-1}$. Either way, you get $\theta_i = \frac{x_i}{\lambda}$. Plugging into the constraint gives $\frac{x_1+x_2+x_3}{\lambda} = 1$, so $\lambda = x_1+x_2+x_3$ and hence $\theta_i = \frac{x_i}{x_1+x_2+x_3}$.
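A minimal numeric check of this closed form, again using hypothetical counts $x = (3, 5, 2)$: the resulting $\theta$ satisfies the constraint, and the stationarity condition $x_i/\theta_i = \lambda$ holds with the same $\lambda$ for every $i$.

```python
# Sketch: verify theta_i = x_i / (x_1 + x_2 + x_3) on hypothetical counts.
x = (3, 5, 2)
lam = sum(x)                      # lambda = x_1 + x_2 + x_3
theta = [xi / lam for xi in x]    # theta_i = x_i / lambda

print(sum(theta))                                # constraint: sums to 1
print([xi / ti for xi, ti in zip(x, theta)])     # all ratios equal lambda
```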