why is the solution of an optimization problem given by (local) optima of the lagrange function?

67 Views Asked by At

Maybe it is already answered here but there are way too many suggestions popping up including lagrange..

My question is rather basic: Why does introducing a lagrange multiplicater transform the initial solution of the optimization task into a local optima of the lagrange function? Or better described: Why is the solution given by the corresponding optima?

1

There are 1 best solutions below

1
On BEST ANSWER

$\nabla f$ is the direction of greatest local increase in the function $f$. Moving perpendicular to $\nabla f$ causes no local change in the function. Thus, a local optimum on the constraint surface must occur when every direction you can move on the surface is perpendicular to $\nabla f$.

If the equation of the constraint surface is given in the form $g(\mathbf{x}) = c$, then $\nabla g$ is everywhere perpendicular to the constraint surface. We see this because, as you move along the constraint surface, $g$ doesn't change, so every direction you can move on the surface must always be in a direction of no local change in $g$ (i.e., perpendicular to $\nabla g$).

So, since $\nabla g$ is perpendicular to the constraint surface everywhere and $\nabla f$ is perpendicular to the constraint surface at local optima, local optima must occur at places where $\nabla g$ is parallel to $\nabla f$. This means there is some real constant $\lambda$ such that $\nabla f = \lambda \nabla g$ at each local optimum, or equivalently, that $$ \nabla (f - \lambda g) = 0. $$

For multiple constraints of the form $g_i(\mathbf{x}) = c_i$, the constraint surface is perpendicular to each gradient $\nabla g_i$ at all of its points, and $\nabla f$ is perpendicular to the constraint surface whenever it is some linear combination of the $\nabla g_i$. Thus, in the multiple constraint case, we have $$ \nabla \left(f - \sum_i \lambda_i g_i\right) = 0. $$