Visualizing Lagrange multipliers

Sorry if this seems like a very basic question but I am having trouble visualizing Lagrange multipliers. Particularly the equation:

$\nabla f = \lambda \nabla g$

where $f$ is the function to maximise and $g$ is the constraint.

I don't understand why equating the gradients in such a way produces the extremum. I watched a Khan Academy video and the explanation was as follows:

Plotting Contours of f and g

My question is: Why does the extremum occur only where a contour of f touches the constraint at a single point? Why can it not occur where a contour of f meets the constraint at several points? For example:

Ex: Contours of f meeting g at two points

Also, why does equating the gradients in this way pick out the point where the curves touch at only one point? My understanding is that $\nabla f$ is the gradient vector of f and $\nabla g$ is the gradient vector of g. It seems to me that there may be infinitely many points which satisfy the equation but are not necessarily the extremum.

Kindly help me visualise what is going on here. Thanks in advance.

EDIT: My understanding of gradients

There are 3 best solutions below

0
On

Suppose $x_0$ is an extremum. Then the constraint contour $g(x) = 0$ and the cost contour $f(x)=f(x_0)$ both pass through the point $x_0$.

Suppose the contours are not 'parallel' at $x_0$, that is, suppose they cross. Then, moving along the contour $g(x) = 0$ through $x_0$, you pass from one side of the contour $f(x)=f(x_0)$ to the other, so there are points on $g(x) = 0$ arbitrarily close to $x_0$ with both higher and lower values of $f$, which contradicts $x_0$ being an extremum.

So the contours must be 'parallel' (tangent) at $x_0$. Since each gradient is perpendicular to its own contour, this means the gradients must be collinear at $x_0$.
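The crossing argument can be checked numerically. The sketch below is only an illustration under assumed choices that are not from the question: the objective $f(x, y) = x + y$ and the constraint $g(x, y) = x^2 + y^2 - 1 = 0$ (the unit circle), with hypothetical helper names. It compares a point where the contours cross with the point where they touch.

```python
import math

# Assumed example (not from the question): f(x, y) = x + y on the unit circle.
def f(x, y):
    return x + y

def grad_f(x, y):
    return (1.0, 1.0)

def grad_g(x, y):
    return (2.0 * x, 2.0 * y)  # g(x, y) = x^2 + y^2 - 1

def on_constraint(theta):
    """Point on the constraint g = 0, parametrised by the angle theta."""
    return (math.cos(theta), math.sin(theta))

def cross(u, v):
    """2D cross product; it is zero exactly when u and v are collinear."""
    return u[0] * v[1] - u[1] * v[0]

def probe(theta, step=0.1):
    """Values of f just before, at, and just after the point at angle theta."""
    return tuple(f(*on_constraint(theta + s)) for s in (-step, 0.0, step))

# At theta = 0 the contour f = 1 crosses the circle.
crossing = probe(0.0)
cross_at_crossing = cross(grad_f(*on_constraint(0.0)), grad_g(*on_constraint(0.0)))

# At theta = pi/4 the contour f = sqrt(2) touches the circle.
tangent = probe(math.pi / 4)
cross_at_tangent = cross(grad_f(*on_constraint(math.pi / 4)),
                         grad_g(*on_constraint(math.pi / 4)))

print("crossing point:", crossing, "cross product:", cross_at_crossing)
print("tangent point: ", tangent, "cross product:", cross_at_tangent)
```

At the crossing point, $f$ is lower on one side and higher on the other, and the cross product of the gradients is nonzero. At the tangent point, $f$ is lower on both sides and the cross product vanishes up to rounding.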

0
On

I was thrown off by this Medium article: https://medium.com/@andrew.chamberlain/a-simple-explanation-of-why-lagrange-multipliers-works-253e2cdcbf74

I had been incorrectly visualising the gradient of a 2D function in 3D.

Although most of its content is correct, the diagram is slightly misleading in how it represents the gradients. I had to revisit the definition of the gradient, which I did through this post: What is a Gradient?. Finally, piecing everything together, what I see is this: for a function $f(x_1, x_2)$, the gradient vectors must be "projected" onto the $x_1$-$x_2$ plane; that is, $\nabla f$ is a two-component vector lying in that plane, not a vector in the 3D space of the graph. On that plane, if $\nabla f = \lambda \nabla g$ (the vectors are collinear), then we have reached a stationary point, which may yield an extremum.
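To make the "gradient lives in the $x_1$-$x_2$ plane" point concrete, here is a minimal sketch. The function $f(x_1, x_2) = x_1^2 + 2x_2^2$, the sample point, and the helper `grad` are all assumptions chosen for illustration. It estimates the gradient by central differences, then compares a small step along the gradient with a small step perpendicular to it, which runs along the contour.

```python
import math

# Assumed example function of two variables (not from the post).
def f(x1, x2):
    return x1 ** 2 + 2.0 * x2 ** 2

def grad(func, x1, x2, h=1e-6):
    """Central-difference estimate of the gradient: a vector with two components."""
    d1 = (func(x1 + h, x2) - func(x1 - h, x2)) / (2.0 * h)
    d2 = (func(x1, x2 + h) - func(x1, x2 - h)) / (2.0 * h)
    return (d1, d2)

p = (1.0, 0.5)
g = grad(f, *p)                 # two components only; there is no "height" component
norm = math.hypot(*g)
u = (g[0] / norm, g[1] / norm)  # unit vector along the gradient
t = (-u[1], u[0])               # unit vector perpendicular to it (contour direction)

step = 1e-3
change_along_grad = f(p[0] + step * u[0], p[1] + step * u[1]) - f(*p)
change_along_contour = f(p[0] + step * t[0], p[1] + step * t[1]) - f(*p)

print("gradient:", g)
print("change in f along the gradient:", change_along_grad)
print("change in f along the contour: ", change_along_contour)
```

The step along the gradient changes $f$ in proportion to the step size, while the perpendicular step changes it only at second order. That is why the gradient is drawn perpendicular to the contour in the plane of the inputs.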

0
On

I think this answer puts it best: https://math.stackexchange.com/a/1290499/669259

$\nabla f$ is the direction you need to go (to increase the value of $f$), and $\nabla g$ is the direction you can't go (because that's the direction off the surface, and you need to stay on). When these point in the same direction (or in exactly opposite directions), no possible movement makes any improvement.
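One way to see this is to split $\nabla f$ into a part along $\nabla g$ (the forbidden direction) and a part along the constraint (the allowed direction). The sketch below assumes an example that is not from the thread, $f(x, y) = xy$ with the constraint $g(x, y) = x + y - 1 = 0$, and a hypothetical helper `split_gradient`.

```python
import math

# Assumed example (not from the thread): f(x, y) = x * y subject to x + y = 1.
def grad_f(x, y):
    return (y, x)

def grad_g(x, y):
    return (1.0, 1.0)  # g(x, y) = x + y - 1

def split_gradient(x, y):
    """Components of grad f along grad g (off the constraint) and along the constraint."""
    fx, fy = grad_f(x, y)
    gx, gy = grad_g(x, y)
    g_norm = math.hypot(gx, gy)
    nx, ny = gx / g_norm, gy / g_norm  # unit normal: the direction you can't go
    tx, ty = -ny, nx                   # unit tangent: the direction you can go
    normal_part = fx * nx + fy * ny
    tangent_part = fx * tx + fy * ty
    return normal_part, tangent_part

off = split_gradient(0.2, 0.8)  # a feasible point that is not the extremum
at = split_gradient(0.5, 0.5)   # the constrained maximum of x * y on x + y = 1

print("away from the extremum (normal, tangent):", off)
print("at the extremum        (normal, tangent):", at)
```

Away from the extremum the tangent part is nonzero, so sliding along the constraint still improves $f$. At the extremum all of $\nabla f$ points along $\nabla g$, so no allowed movement helps.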

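On top of that, if you consider the Lagrangian, $$ \mathcal{L}(x, \lambda) = f(x) + \lambda g(x) \,,$$ the Lagrange multiplier tells you the cost (or return) of not satisfying the constraint $g(x) = 0$, per unit of constraint violation. Note that with this sign convention the stationarity condition reads $\nabla f = -\lambda \nabla g$, so this $\lambda$ is the negative of the one in the question; the sign is purely a matter of convention.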
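The "per unit of constraint" reading can be checked on a small example. The sketch below rests on several assumptions that are not in the thread: the objective $f(x, y) = x + y$, the constraint $g(x, y) = x^2 + y^2 - 1$, and the interpretation of "not satisfying the constraint" as relaxing it to $g(x, y) = c$ for a small $c$. With the convention $\mathcal{L} = f + \lambda g$ used above, the rate of change of the best value with respect to $c$ should then equal $-\lambda$.

```python
import math

# Assumed example: maximise f(x, y) = x + y on the relaxed constraint
# g(x, y) = x^2 + y^2 - 1 = c, whose maximiser is known in closed form.
def best_point(c):
    """Maximiser of x + y on the circle x^2 + y^2 = 1 + c."""
    r = math.sqrt(1.0 + c)
    return (r / math.sqrt(2.0), r / math.sqrt(2.0))

def best_value(c):
    x, y = best_point(c)
    return x + y

# Stationarity of L = f + lam * g means grad f + lam * grad g = 0,
# i.e. 1 + lam * 2x = 0 at the maximiser of the original problem (c = 0).
x0, y0 = best_point(0.0)
lam = -1.0 / (2.0 * x0)

# Central-difference estimate of how the best value responds to relaxing the constraint.
h = 1e-6
sensitivity = (best_value(h) - best_value(-h)) / (2.0 * h)

print("lambda:          ", lam)
print("d(best value)/dc:", sensitivity)
```

In this example the two printed numbers agree up to sign, which matches the reading of $\lambda$ as a price per unit of constraint. The sign flip comes only from writing $+\lambda g$ in the Lagrangian.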