Optimization: is it possible to replace two Lagrange multipliers with a single one?


The question is simple: "find the point closest to the origin that is on the intersection line of $y+2z=12$ and $x+y=6$."

The normal method uses two Lagrange multipliers and gives the point $(2,4,4)$. I get that. But I used the following method and failed to get the same result:

I designed an equivalent constraint from the 2 constraints given above:

$g(x,y,z) = (y+2z-12)^2 + (x+y-6)^2 = 0$

I then tried to use only one Lagrange multiplier, since there is now only one constraint. However, I cannot seem to get $(2,4,4)$. So my question is: what went wrong? Did I miss something?


There are 2 best solutions below

9
On BEST ANSWER

$\mathbf{EDIT}$: I previously posted a "solution" agreeing that the OP's method works, but I was incorrect; this is a correct (to the best of my knowledge) revision.

The method of Lagrange multipliers for a single constraint seeks to extremize a real-valued function $f(\vec{x})$ subject to the constraint $g(\vec{x})=0$. That is, the solution $\vec{x}_0$ must extremize the restriction of $f$ to $S \equiv \{\vec{x} \in \mathbb{R}^n \mid g(\vec{x})=0 \}$. The method makes use of the fact that, if $\nabla g(\vec{x}_0) \neq 0$, then $S$ is a differentiable surface in a neighborhood of $\vec{x}_0$ with normal vector $\nabla g(\vec{x}_0)$. Since the restriction $f|_S$ is extremized at $\vec{x}_0$, $\nabla f(\vec{x}_0)$ must then be normal to the surface (or else $f$ would be increasing along a curve in $S$ whose tangent at $\vec{x}_0$ is the projection of $\nabla f(\vec{x}_0)$ onto the tangent space $T_{\vec{x}_0}S$). That is, $\nabla f(\vec{x}_0) = \lambda_0 \nabla g(\vec{x}_0)$ for some $\lambda_0 \in \mathbb{R}$. So, if one defines the function $$L(\vec{x},\lambda) \equiv f(\vec{x})-\lambda g(\vec{x}),$$ then $\nabla L(\vec{x}_0,\lambda_0)=0$ ($\nabla$ here is the gradient in $n+1$ dimensions).

The above logic, and therefore this conclusion, was contingent on the assumption that $\nabla g(\vec{x}_0) \neq 0$. If this is not satisfied, the extremizer $\vec{x}_0$ of $f|_S$ need not satisfy $\nabla L(\vec{x}_0,\lambda) = 0$ for any $\lambda \in \mathbb{R}$, and we may not be able to recover the solution $\vec{x}_0$ by finding stationary points of $L$. This is precisely the problem in your approach.

Indeed, if $g$ is a sum of squares of functions, $g(\vec{x})=\sum_k (g_k(\vec{x}))^2$, then $g(\vec{x})=0 \iff g_k(\vec{x})=0$ for each $k$, so for every $\vec{x}_0 \in S$ $$\nabla g(\vec{x}_0) = \sum_k 2g_k(\vec{x}_0) \nabla g_k(\vec{x}_0) = 0$$ So no point in $S$ satisfies the hypotheses necessary to rely on the Lagrange method.
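To make this concrete for the constraint in the question, here is a worked check. It assumes the objective is the squared distance $f(x,y,z)=x^2+y^2+z^2$ (the question does not write $f$ out, so this choice is an assumption). The condition $\nabla f = \lambda \nabla g$ reads

$$\begin{aligned} 2x &= 2\lambda\,(x+y-6), \\ 2y &= 2\lambda\,\big[(y+2z-12)+(x+y-6)\big], \\ 2z &= 4\lambda\,(y+2z-12). \end{aligned}$$

The constraint $g=0$ forces $y+2z-12=0$ and $x+y-6=0$, so every right-hand side vanishes and the system demands $x=y=z=0$. But $g(0,0,0)=144+36=180\neq 0$, so the system has no solution at all. In particular the true minimizer $(2,4,4)$ is missed, since there $\nabla f=(4,8,8)\neq 0$ while $\lambda\nabla g=0$ for every $\lambda$.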

6
On

A main reason is that, for the method of Lagrange multipliers, setting

$$g(x,y,z) = (y+2z-12)^2 + (x+y-6)^2 \color{red}{= 0}\tag{1}$$

(I understand that you want it to be zero for the equivalence with your two linear equations)

or setting

$$g(x,y,z) = (y+2z-12)^2 + (x+y-6)^2 \color{red}{= k}\tag{2}$$

for any constant $k$ leads to the same gradient condition $\nabla f = \lambda \nabla g$, because the constant $k$ disappears on differentiation.

Equation (2) describes a family of nested ("Russian doll") cylinders $C_k$.


Edit 1: in fact, encountering cylinders could be a good thing, because your problem can be recast as follows: take circular cylinders $\Gamma_R$ of increasing radius $R$ whose common axis is the intersection line $(L)$ of $y+2z=12$ and $x+y=6$. Stop when $R$ is such that $\Gamma_R$ passes through the origin; this $R$ is the distance sought.

The issue is that the cylinders $C_k$ described in the first part do not grow in the right way (they are in fact elliptical cylinders): the value of $k$ for which $C_k$ passes through the origin cannot be read off as a distance.
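One way to see why the $C_k$ are elliptical is sketched below. The symbols $A$, $b$ and $r$ are introduced here for illustration and do not appear in the original post. Write the two planes as $A\vec{x}=b$ and let $r(\vec{x})=A\vec{x}-b$ be the residual:

$$A=\begin{pmatrix}0&1&2\\1&1&0\end{pmatrix},\qquad b=\begin{pmatrix}12\\6\end{pmatrix},\qquad AA^{T}=\begin{pmatrix}5&1\\1&2\end{pmatrix}.$$

Then $g(\vec{x})=r^{T}r$, whereas the squared distance from $\vec{x}$ to the line $(L)$ is

$$d^2(\vec{x})=r^{T}\,(AA^{T})^{-1}\,r .$$

The two quadratic forms are proportional only when $AA^{T}$ is a multiple of the identity, that is, when the two normals are perpendicular and of equal length. Here $AA^{T}$ has two distinct eigenvalues, so the level sets $g=k$ are elliptical rather than circular cylinders around $(L)$.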

We can, however, remedy this by describing $(L)$ differently, as the intersection of two planes

$$P_1=u_1x+v_1y+w_1z-h_1=0, \ \ \ \text{and} \ \ \ P_2=u_2x+v_2y+w_2z-h_2=0,$$

  • perpendicular to each other ($u_1u_2+v_1v_2+w_1w_2=0$),

  • with normalized coefficients ($u_k^2+v_k^2+w_k^2=1$, $k=1,2$).

Then, replacing $(y+2z-12)^2+(x+y-6)^2$ by $P_1^2+P_2^2$, we obtain an expression which is the square of the distance to the axis $(L)$, and the cylinder argument above applies.
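The replacement of the two planes by an orthonormal pair can be checked numerically. The following is a minimal sketch, not part of the original answer: the helper names `dot`, `orthonormalize_planes` and `sum_of_squares` are hypothetical, and Gram-Schmidt is just one convenient way (an assumption here) to obtain planes $P_1$, $P_2$ satisfying the two bullet conditions.

```python
import math

def dot(a, b):
    """Euclidean dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))

def orthonormalize_planes(n1, h1, n2, h2):
    """Gram-Schmidt on two planes n_k . x = h_k that cut out the same line.

    Returns two planes with unit, mutually perpendicular normals.
    """
    s1 = math.sqrt(dot(n1, n1))
    u1 = [c / s1 for c in n1]
    k1 = h1 / s1
    # Remove from plane 2 its component along plane 1
    # (the same combination is applied to the normal and to the offset).
    t = dot(n2, u1)
    m2 = [c - t * d for c, d in zip(n2, u1)]
    j2 = h2 - t * k1
    s2 = math.sqrt(dot(m2, m2))
    u2 = [c / s2 for c in m2]
    k2 = j2 / s2
    return (u1, k1), (u2, k2)

def sum_of_squares(planes, p):
    """Evaluate sum_k (n_k . p - h_k)^2 at the point p."""
    return sum((dot(n, p) - h) ** 2 for n, h in planes)

# Planes from the question: y + 2z = 12 and x + y = 6.
original = [([0.0, 1.0, 2.0], 12.0), ([1.0, 1.0, 0.0], 6.0)]
(n1, h1), (n2, h2) = original
rewritten = orthonormalize_planes(n1, h1, n2, h2)

origin = [0.0, 0.0, 0.0]
closest = [2.0, 4.0, 4.0]  # minimizer quoted in the question

g_at_origin = sum_of_squares(original, origin)    # original g, not a distance
d2_at_origin = sum_of_squares(rewritten, origin)  # P1^2 + P2^2 at the origin
true_d2 = dot(closest, closest)                   # |(2,4,4)|^2

print(g_at_origin, d2_at_origin, true_d2)
```

With the original planes the value at the origin is 180, while the orthonormalized pair gives a value that agrees (up to rounding) with the squared distance 36 from the origin to $(2,4,4)$.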


Edit 2: in fact, one cannot rigorously speak of $\overrightarrow{\mathrm{grad}}(g)$ as a normal on $g(x,y,z)=0$, because this equation is equivalent to that of the straight line $(L)$, and a straight line in 3D has no single well-defined normal direction.