Can anyone help me understand Lagrange Multipliers?


I'm trying to understand the method of Lagrange multipliers. The explanation I'm currently looking at says something along the lines of:

"Suppose we wish to minimise the function $f(x,y)$ subject to the constraint $g(x,y)=0$, and that this minimum is the point $(x_{0}, y_{0})$. Then $\nabla f(x_{0}, y_{0})$ is the normal to the function $f$ at this point. Furthermore, the normal vectors of $f$ and $g$ are parallel. Thus, $\nabla f(x_{0}, y_{0})=\lambda \nabla g(x_{0}, y_{0})$."

(Source: http://www.slimy.com/~steuard/teaching/tutorials/Lagrange.html)

I really don't understand why the normal vectors of $f$ and $g$ are parallel, or how this gives rise to the equation $\nabla f(x_{0}, y_{0})=\lambda \nabla g(x_{0}, y_{0})$.

Could someone please explain this to me?

Many thanks.


There are 2 best solutions below

0
On

I’ve always thought of it in terms of differentials and tangent vectors instead of gradients. If $f(\mathbf P)$, restricted to the curve $g(\mathbf P)=\text{constant}$, has an extremum at the point $\mathbf P_0$, then $\mathrm df_{\mathbf P_0}(\mathbf v)=0$ for any vector $\mathbf v$ that’s tangent to that curve at $\mathbf P_0$. The curve is a level curve of $g$, so $\mathbf v$ also satisfies $\mathrm dg_{\mathbf P_0}(\mathbf v)=0$. In other words, $\mathrm df_{\mathbf P_0}$ vanishes on every vector on which $\mathrm dg_{\mathbf P_0}$ vanishes. Provided $\mathrm dg_{\mathbf P_0}\neq 0$, that means $\mathrm df_{\mathbf P_0}$ must be a multiple of $\mathrm dg_{\mathbf P_0}$, say $\mathrm df_{\mathbf P_0}=\lambda\,\mathrm dg_{\mathbf P_0}$.

This translates pretty directly to gradients, which are orthogonal to tangents. The gradient of a function is always normal to its level curves, so $\nabla g$ is everywhere normal to the constraint curve $g(x,y)=0$. Now, the gradient of $f$ at a point gives the direction of fastest increase. The rate of change of $f$ in any other direction is $\nabla f\cdot\mathbf u$, where $\mathbf u$ is a unit vector that specifies the direction. That dot product is $0$ exactly when $\mathbf u$ is orthogonal to $\nabla f$. So, for $f$’s value along some curve to be stationary at some point (as it must be at a local extremum), $\nabla f$ must be orthogonal to the curve’s tangent there, i.e., normal to the curve and therefore parallel to $\nabla g$. This means that it must be some scalar multiple of $g$’s gradient, i.e., $\nabla f=\lambda\nabla g$.
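To see the parallel-gradients claim numerically, here is a minimal sketch. The example functions $f(x,y)=x+y$ and $g(x,y)=x^2+y^2-1$ are my own choice for illustration and do not come from the question or the linked tutorial. The code walks around the constraint curve, picks the point where $f$ is largest, and checks that $\nabla f$ and $\nabla g$ are parallel there (in 2D, their cross product is zero).

```python
import math

def f(x, y):
    # Illustrative objective (an assumption, not from the question).
    return x + y

def grad_f(x, y):
    # Gradient of f(x, y) = x + y.
    return (1.0, 1.0)

def grad_g(x, y):
    # Gradient of the illustrative constraint g(x, y) = x**2 + y**2 - 1.
    return (2.0 * x, 2.0 * y)

# Walk around the constraint curve g = 0 (the unit circle) on a fine grid
# of angles and keep the angle where f is largest.
N = 100_000
best_t = max(
    (2.0 * math.pi * k / N for k in range(N)),
    key=lambda t: f(math.cos(t), math.sin(t)),
)
x0, y0 = math.cos(best_t), math.sin(best_t)

fx, fy = grad_f(x0, y0)
gx, gy = grad_g(x0, y0)

# In 2D, two vectors are parallel exactly when this cross product is zero.
cross = fx * gy - fy * gx

# If grad f = lam * grad g, projecting grad f onto grad g recovers lam.
lam = (fx * gx + fy * gy) / (gx * gx + gy * gy)

print("constrained maximum near:", (x0, y0))
print("cross product of gradients:", cross)
print("multiplier lambda:", lam)
```

At any other point of the circle (try `x0, y0 = 1.0, 0.0`) the cross product is nonzero, which is the statement that $f$ can still be increased by sliding along the constraint.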

2
On

$\vec \nabla f$ is not normal to the function $f$; it is normal to a curve of constant $f$.

More precisely, $\vec \nabla f(x_0, y_0)$ is perpendicular to the tangent line, at the point $(x_0, y_0)$, of the curve defined by $f(x, y)=\text{Const.}$ that passes through that point.

The method of Lagrange multipliers deals with the situation in which you want to find the extrema of a function $f$ constrained to a curve of constant $g$.

When you are constrained to a curve of constant $g$, you must always move in the direction of the tangent to the curve $g=\text{Const.}$, which is perpendicular to $\vec \nabla g$.

To have a constrained extremum, the directional derivative of $f$ in the direction of the tangent line to the constraining curve must vanish. This means that $\vec \nabla f$ must also be perpendicular to the tangent of the curve $g=\text{Const.}$ at that point.

Since both $\vec \nabla f$ and $\vec \nabla g$ are perpendicular to the same line in the plane, they must be parallel to each other (assuming $\vec \nabla g \neq 0$ at that point).

So $\vec \nabla f = \lambda \vec \nabla g$ for some scalar $\lambda$.
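As a small worked example of the condition in use: the particular $f$ and $g$ below are chosen purely for illustration and are not from the question. Take the problem of minimising $x^2+y^2$ on the line $x+y=1$.

```latex
\begin{aligned}
f(x,y) &= x^2 + y^2, \qquad g(x,y) = x + y - 1 = 0,\\
\vec\nabla f &= (2x,\, 2y), \qquad \vec\nabla g = (1,\, 1),\\
\vec\nabla f = \lambda \vec\nabla g \;&\Longrightarrow\; 2x = \lambda,\quad 2y = \lambda \;\Longrightarrow\; x = y,\\
g(x,y) = 0 \;&\Longrightarrow\; x_0 = y_0 = \tfrac12,\quad \lambda = 1 .
\end{aligned}
```

As a check, the line $x+y=1$ has tangent direction $(1,-1)$, and $\vec\nabla f(\tfrac12,\tfrac12)\cdot(1,-1) = 1 - 1 = 0$, so the directional derivative of $f$ along the constraint does vanish at that point.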