In the Lagrange multiplier method, why are the level curves of $f$ and $g$ tangent to each other?


In the Lagrange multiplier method, we optimize a function $f(x_1, \dots, x_n)$ under a constraint $g(x_1, \dots, x_n) = 0$. A key fact is that $\nabla f$ is parallel to $\nabla g$ at the points where $f$ is optimized under the constraint. This follows from the level curves of $f$ and $g$ being tangent to each other at those points: their tangent lines there coincide, and since the gradient is orthogonal to the tangent of a level curve, the two gradients must be parallel.

The only part I don't understand intuitively is why the level curves of $f$ and $g$ are tangent to each other at the points where $f$ is optimized under the constraint $g = 0$.


There are 4 best solutions below

0
On

Parametrize the curve $g(x) = 0$ near $p$ by $c(t)$ such that $c(0) = p$ and $c'(0) \neq 0$, where $p$ is a local extremum of $f$ on the curve. Then $f(c(t))$ has a local min/max at $t = 0$, i.e.,

$$\frac{d}{dt} f(c(t)) |_{t=0} = \nabla f(p) \cdot c'(0) = 0$$

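We also know that $\nabla g(p) \cdot c'(0) = 0$, because $g(c(t)) = 0$ for all $t$. In the plane, both gradients are therefore orthogonal to the same non-zero vector $c'(0)$, so they are parallel: provided $\nabla g(p) \neq 0$, there exists a scalar $\lambda$ (possibly zero) such that the equation below holds. In $n$ variables, the same argument applied to every curve in the constraint set through $p$ gives the same conclusion.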

$$\nabla f(p) = \lambda \nabla g(p)$$
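As a sanity check of this argument, here is a minimal numeric sketch. The example is an assumption chosen for illustration, not part of the answer: $f(x, y) = x + y$ on the unit circle $g(x, y) = x^2 + y^2 - 1 = 0$, with $p = (1/\sqrt{2}, 1/\sqrt{2})$ and the parametrization $c(t) = (\cos(t_0 + t), \sin(t_0 + t))$, $t_0 = \pi/4$.

```python
import math

# Illustrative choice (an assumption, not from the answer):
# f(x, y) = x + y, constraint g(x, y) = x**2 + y**2 - 1 = 0.
def grad_f(x, y):
    return (1.0, 1.0)

def grad_g(x, y):
    return (2.0 * x, 2.0 * y)

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

t0 = math.pi / 4
p = (math.cos(t0), math.sin(t0))           # candidate extremum, c(0) = p
velocity = (-math.sin(t0), math.cos(t0))   # c'(0) for c(t) = (cos(t0 + t), sin(t0 + t))

gf, gg = grad_f(*p), grad_g(*p)

df_along_curve = dot(gf, velocity)         # d/dt f(c(t)) at t = 0
dg_along_curve = dot(gg, velocity)         # d/dt g(c(t)) at t = 0

# In 2D, two vectors are parallel exactly when this determinant vanishes.
cross = gf[0] * gg[1] - gf[1] * gg[0]

# If the gradients are parallel, lambda is the projection coefficient.
lam = dot(gf, gg) / dot(gg, gg)

print(df_along_curve, dg_along_curve, cross, lam)
```

Both directional derivatives and the determinant should vanish up to rounding, and `lam` should be close to $1/\sqrt{2}$, the multiplier for this particular example.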

0
On

Each level curve of $f$ corresponds to a single value of $f$, and these values increase in the direction of the gradient. This means that, given a level curve whose value is not a local maximum, there is another level curve nearby on which the value of $f$ is greater.

Imagine the constraint $g = 0$ as a curve that cuts across a level curve of $f$ at a point $p$. Since the constraint curve crosses that level curve, it also intersects nearby level curves of $f$ on either side of it. Therefore, we can move along the constraint to a level curve with a greater value of $f$ than the one containing $p$, and so the maximum cannot occur at $p$.

Therefore, to maximize $f$, we move to level curves in the direction of increase until we can go no further, which occurs when the level curve of $f$ is tangent to the constraint curve $g = 0$.
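The crossing-versus-tangent picture can be probed numerically. The sketch below uses an assumed example, $f(x, y) = x + y$ on the unit circle, and the helper name `neighbors_beat` is hypothetical. It asks whether a small step along the constraint reaches a higher level curve of $f$.

```python
import math

def f_on_circle(theta):
    # Assumed example: f(x, y) = x + y restricted to the unit circle
    # (x, y) = (cos(theta), sin(theta)).
    return math.cos(theta) + math.sin(theta)

STEP = 1e-3  # small step along the constraint curve

def neighbors_beat(theta):
    """True if stepping along the circle reaches a higher level curve of f."""
    here = f_on_circle(theta)
    return max(f_on_circle(theta - STEP), f_on_circle(theta + STEP)) > here

# At theta = 0 the circle crosses the level line x + y = 1 transversally.
crossing_point_improves = neighbors_beat(0.0)

# At theta = pi/4 the circle is tangent to the level line x + y = sqrt(2).
tangent_point_improves = neighbors_beat(math.pi / 4)

print(crossing_point_improves, tangent_point_improves)
```

At the crossing point a neighboring point on the constraint lies on a higher level curve, so that point cannot be the maximum. At the tangent point neither neighbor does.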

0
On

The situation is as in the picture on Wikipedia (image not available here).

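Consider the level curves $f=d$ and increase $d$ until the curve first touches $g=c$. At that moment of contact, $f$ attains its minimum on the constraint. If you keep increasing $d$, then at the last moment before the curve $f=d$ loses contact with $g=c$, $f$ attains its maximum.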

Come to think of it, it is like parking! In fact, the idea is so fruitful that it is the basis of Morse theory.

1
On

Assume you want to optimize $f(x_1,x_2)$ subject to $g(x_1,x_2)=c$. To ensure that the constraint holds, every admissible displacement $(dx_1, dx_2)$ must satisfy

$$dg=\frac{\partial g}{\partial x_1}dx_1+\frac{\partial g}{\partial x_2}dx_2=0$$

$$\Rightarrow dx_2=-\frac{\frac{\partial g}{\partial x_1}}{\frac{\partial g}{\partial x_2}}dx_1$$

(assuming $\frac{\partial g}{\partial x_2} \neq 0$). At a constrained optimum, $f$ is stationary along such displacements:

$$df=\frac{\partial f}{\partial x_1}dx_1+\frac{\partial f}{\partial x_2}dx_2=0$$

Replacing $dx_2$:

$$df=\frac{\partial f}{\partial x_1}dx_1-\frac{\partial f}{\partial x_2}\frac{\frac{\partial g}{\partial x_1}}{\frac{\partial g}{\partial x_2}}dx_1=0$$

$$df=\bigg(\frac{\partial f}{\partial x_1}-\frac{\partial f}{\partial x_2}\frac{\frac{\partial g}{\partial x_1}}{\frac{\partial g}{\partial x_2}}\bigg)dx_1=0$$

Since this must hold for arbitrary $dx_1$,

$$\Rightarrow \frac{\partial f}{\partial x_1}-\frac{\partial f}{\partial x_2}\frac{\frac{\partial g}{\partial x_1}}{\frac{\partial g}{\partial x_2}}=0$$

$$\Rightarrow \frac{\frac{\partial f}{\partial x_1}}{\frac{\partial g}{\partial x_1}}=\frac{\frac{\partial f}{\partial x_2}}{\frac{\partial g}{\partial x_2}}=\lambda$$

(assuming also $\frac{\partial g}{\partial x_1} \neq 0$), i.e. $\nabla f = \lambda \nabla g$. Now by using the dot product we can check the angle between $\nabla f$ and $\nabla g$:

$$\nabla f=\left(\frac{\partial f}{\partial x_1}, \ \frac{\partial f}{\partial x_2}\right), \qquad \nabla g=\left(\frac{\partial g}{\partial x_1}, \ \frac{\partial g}{\partial x_2}\right)$$

$$\cos(\theta)=\frac{\nabla f \cdot \nabla g}{\Vert \nabla f\Vert \, \Vert \nabla g\Vert}$$

$$\cos(\theta)=\frac{\frac{\partial f}{\partial x_1}\frac{\partial g}{\partial x_1}+\frac{\partial f}{\partial x_2}\frac{\partial g}{\partial x_2}}{\sqrt{\left(\frac{\partial f}{\partial x_1}\right)^2+\left(\frac{\partial f}{\partial x_2}\right)^2}\sqrt{\left(\frac{\partial g}{\partial x_1}\right)^2+\left(\frac{\partial g}{\partial x_2}\right)^2}}$$

$$\cos(\theta)=\frac{\lambda \frac{\partial g}{\partial x_1}\frac{\partial g}{\partial x_1}+\lambda \frac{\partial g}{\partial x_2}\frac{\partial g}{\partial x_2}}{\sqrt{\lambda^2 \left(\frac{\partial g}{\partial x_1}\right)^2+\lambda^2 \left(\frac{\partial g}{\partial x_2}\right)^2}\sqrt{\left(\frac{\partial g}{\partial x_1}\right)^2+\left(\frac{\partial g}{\partial x_2}\right)^2}}$$

$$\cos(\theta)=\frac{\lambda \left(\left(\frac{\partial g}{\partial x_1}\right)^2 + \left(\frac{\partial g}{\partial x_2}\right)^2\right)}{|\lambda| \left(\left(\frac{\partial g}{\partial x_1}\right)^2 + \left(\frac{\partial g}{\partial x_2}\right)^2\right)}=\frac{\lambda}{|\lambda|}=\pm 1 \quad (\lambda \neq 0)$$

Since $\cos(\theta)=\pm 1$, $\nabla f$ and $\nabla g$ are parallel: they point in the same direction when $\lambda > 0$ and in opposite directions when $\lambda < 0$.
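A short numeric sketch of the angle computation, using an assumed example that is not part of the answer: $f(x_1, x_2) = x_1 + x_2$ with the constraint $g(x_1, x_2) = x_1^2 + x_2^2 = 1$. The cosine is evaluated at the constrained maximum and at the constrained minimum. Note that the sign of $\cos(\theta)$ follows the sign of $\lambda$, so parallel gradients may point in opposite directions.

```python
import math

def grad_f(x1, x2):
    # Assumed example: f(x1, x2) = x1 + x2
    return (1.0, 1.0)

def grad_g(x1, x2):
    # Assumed constraint: g(x1, x2) = x1**2 + x2**2 (with c = 1)
    return (2.0 * x1, 2.0 * x2)

def cos_angle(u, v):
    # cos(theta) = (u . v) / (|u| |v|)
    dot = u[0] * v[0] + u[1] * v[1]
    return dot / (math.hypot(u[0], u[1]) * math.hypot(v[0], v[1]))

s = 1.0 / math.sqrt(2.0)
p_max = (s, s)     # constrained maximum of f on the circle
p_min = (-s, -s)   # constrained minimum of f on the circle

cos_at_max = cos_angle(grad_f(*p_max), grad_g(*p_max))
cos_at_min = cos_angle(grad_f(*p_min), grad_g(*p_min))

# The component-wise ratios from the derivation, evaluated at the minimum.
ratio_1 = grad_f(*p_min)[0] / grad_g(*p_min)[0]
ratio_2 = grad_f(*p_min)[1] / grad_g(*p_min)[1]

print(cos_at_max, cos_at_min, ratio_1, ratio_2)
```

In this example the two ratios agree at the minimum and give a negative $\lambda$, which is why the cosine there comes out close to $-1$ rather than $1$.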