Why does the method of Lagrange Multipliers fail when $\nabla g =0$?


I know that one of the preconditions for finding the extrema of $f(x,y)$ subject to $g(x,y)=0$ using the method of Lagrange multipliers is that $\nabla g \neq 0$. I understand that if $\nabla g$ does equal zero, we would have to equate $(0,0)$ with some other finite ordered pair representing the gradient of $f$, and we would end up missing an extremum.
I am looking for a more intuitive reason why this is a condition. For instance, what does it mean graphically that we end up skipping an extremum? Why can't we say that, since we cannot equate $(0,0)$ with some $(a,b) \neq (0,0)$, such an extremum doesn't exist at all?

There is 1 best solution below


Think about the constraint $g$ geometrically: the level sets $g(x,y)=c$ for different values of $c$ foliate the plane, and for almost all values of $c$ (the regular values, i.e., those for which $\nabla g(x,y) \neq (0,0)$ at every point $(x,y)$ with $g(x,y)=c$), the level set is a smooth curve with a well-defined normal direction $\nabla g$ at each point.

Now when you optimize $f$ subject to $g=0$, you are allowed to slide along the level set $g=0$ but not move off of this curve. Thus you can be at a locally optimal point only if one of the following two conditions holds:

  • you are at a critical point of $f$ that just happens to be located on the isoline $g=0$;
  • you are not at a critical point of $f$, but sliding left or right along the curve $g=0$ does not improve your objective function to first order. This will happen if $\nabla f$ is parallel to the normal direction to the curve $\nabla g$.

And it happens that the pair of equations \begin{align*} \nabla f(x,y) - \lambda \nabla g(x,y) &= 0\\ g(x,y) &= 0 \end{align*} exactly captures both of these conditions: the first corresponds to $\lambda = 0$, the second to $\lambda \neq 0$.
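
As a quick check of these equations in a case where nothing goes wrong, here is a small worked example. The functions $f(x,y)=x+y$ and $g(x,y)=x^2+y^2-1$ are chosen purely for illustration and are not from the question. Here $\nabla g = (2x,2y)$ vanishes only at the origin, which does not lie on the circle $g=0$, so zero is a regular value. The system reads \begin{align*} 1 - 2\lambda x &= 0\\ 1 - 2\lambda y &= 0\\ x^2+y^2-1 &= 0. \end{align*} The first two equations force $\lambda \neq 0$ and $x=y=\frac{1}{2\lambda}$; substituting into the constraint gives $\frac{1}{2\lambda^2}=1$, so $\lambda=\pm\frac{1}{\sqrt 2}$ and $(x,y)=\pm\left(\frac{1}{\sqrt 2},\frac{1}{\sqrt 2}\right)$. These are exactly the maximum ($f=\sqrt 2$) and the minimum ($f=-\sqrt 2$) of $f$ on the circle, and at both points $\nabla f=(1,1)$ is parallel to the normal $\nabla g$.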

The method of Lagrange multipliers is only guaranteed to work when zero is a regular value of $g$ because this is the condition that guarantees that $g=0$ is a smooth curve. Otherwise, there can be all kinds of singularities in the level set $g^{-1}(0)$: isolated points, cusps, X-shaped crossings, etc. If $g=0$ is not a smooth curve, the above logic breaks down: you can't talk about "sliding left or right" along the curve and cannot characterize local optimality in terms of $\nabla f$ being parallel to $\nabla g$.
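
To see a concrete failure, consider the following sketch; the specific choice $f(x,y)=x$ and $g(x,y)=y^2-x^3$ is an illustrative assumption, not something taken from the question. The level set $g=0$ is the curve $x=y^{2/3}$, which has a cusp at the origin, and since $x\ge 0$ everywhere on it, the constrained minimum of $f$ is attained at $(0,0)$. But $\nabla g=(-3x^2,\,2y)$ vanishes there, and the Lagrange system \begin{align*} 1 + 3\lambda x^2 &= 0\\ -2\lambda y &= 0\\ y^2 - x^3 &= 0 \end{align*} has no solutions at all: the second equation forces $\lambda=0$ or $y=0$; if $\lambda=0$ the first equation reads $1=0$, and if $y=0$ the constraint gives $x=0$, so the first equation again reads $1=0$. The minimum exists, yet the method never finds it, because at the cusp there is no multiple of $\nabla g=(0,0)$ that can equal $\nabla f=(1,0)$. This is why points where $\nabla g=0$ on the constraint set have to be checked separately rather than discarded.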