Why doesn't minimizing the Lagrangian derivative with regards to the multiplier work?


Given a typical optimization problem for a scalar function of a vector $x$ with a set of equality constraints $$ \min_x f(x)\\ \text{s.t.}\quad g(x)=\mathbf{0} $$ one defines a Lagrangian with a vector of multipliers, one per constraint: $$ \mathcal{L}(x,\lambda) = f(x)+g(x)^T\lambda $$ Using subscripts to denote differentiation, the solution $(x^*,\lambda^*)$ satisfies $$ \mathcal{L}_x(x^*,\lambda^*)=0, \qquad g(x^*)=\mathbf{0} $$ When solving numerically, one uses the Hessian and solves the following linear system to compute a step $$ \left(\begin{matrix}\mathcal{L}_{xx} & g_x^T \\ g_x & 0\end{matrix}\right) \left(\begin{matrix}s_x \\ s_\lambda \end{matrix}\right) = \left(\begin{matrix}-\mathcal{L}_x \\ -g \end{matrix}\right) $$ so that the iteration proceeds to the new point $(x+s_x,\lambda+s_\lambda)$. In my experience with solving this linear system, setting the initial $\lambda$ to be large is usually enough to make the solver run smoothly.
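For concreteness, here is a minimal sketch of one such Newton-KKT step, using a made-up problem $f(x)=x_1^2+x_2^2$ with constraint $g(x)=x_1+x_2-1$ (the function name `newton_kkt_step` and the test problem are my own, not from the question):

```python
import numpy as np

# Illustrative problem: minimize f(x) = x1^2 + x2^2 subject to g(x) = x1 + x2 - 1 = 0.
def newton_kkt_step(x, lam):
    f_x = 2.0 * x                       # gradient of f
    g_x = np.array([[1.0, 1.0]])        # Jacobian of g (1 x 2)
    g = np.array([x[0] + x[1] - 1.0])   # constraint value
    L_x = f_x + g_x.T @ lam             # gradient of the Lagrangian in x
    L_xx = 2.0 * np.eye(2)              # Hessian of the Lagrangian (g is linear)

    # Assemble and solve the KKT system for the step (s_x, s_lambda)
    K = np.block([[L_xx, g_x.T],
                  [g_x, np.zeros((1, 1))]])
    rhs = np.concatenate([-L_x, -g])
    s = np.linalg.solve(K, rhs)
    return x + s[:2], lam + s[2:]

x, lam = np.array([3.0, -1.0]), np.array([10.0])  # large initial lambda
x, lam = newton_kkt_step(x, lam)
# For this quadratic problem a single Newton step lands on the solution:
# x = [0.5, 0.5], lambda = [-1.0]
```

Since both $f$ is quadratic and $g$ is linear here, one step reaches the exact KKT point; for general problems this step is iterated.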

Now, since the solution requires $\mathcal{L}_x=0$, in order to accelerate the process I am tempted to solve the following problem $$ \min_\lambda \frac{1}{2}|\mathcal{L}_x|^2 $$ whose minimizing $\lambda$ is $$ \lambda = -(g_xg_x^T)^{-1}g_xf_x $$ This $\lambda$ would be computed first, and only then $\mathcal{L}_{x}$ and $\mathcal{L}_{xx}$. But when I tried this trick, the solver failed miserably. Why is that so?
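As a sanity check on the closed form, one can verify numerically that it coincides with the least-squares solution of $g_x^T\lambda \approx -f_x$, and that the residual $\mathcal{L}_x$ it leaves is orthogonal to the constraint rows (the values of `f_x` and `g_x` below are made up for illustration):

```python
import numpy as np

# Random made-up gradient and full-row-rank Jacobian.
rng = np.random.default_rng(0)
K, J = 5, 2
f_x = rng.standard_normal(K)
g_x = rng.standard_normal((J, K))

# Closed form from the question: lambda = -(g_x g_x^T)^{-1} g_x f_x
lam = -np.linalg.solve(g_x @ g_x.T, g_x @ f_x)

# It agrees with the least-squares solution of g_x^T lambda ~= -f_x
lam_ls, *_ = np.linalg.lstsq(g_x.T, -f_x, rcond=None)
assert np.allclose(lam, lam_ls)

# The residual L_x = f_x + g_x^T lambda is the projection of f_x onto the
# orthogonal complement of col(g_x^T); it is orthogonal to the rows of g_x
# but generally nonzero.
L_x = f_x + g_x.T @ lam
assert np.allclose(g_x @ L_x, 0.0)
```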

1 Answer
If there are fewer constraints than variables, then $g: \mathbb{R}^K \rightarrow \mathbb{R}^J$ with $J<K$. If the constraints are not redundant, then $\operatorname{rank}(g_x(x))=J$.

The optimal values satisfy the first-order conditions $$ f_x(x)+g_x(x)^T\lambda=0,$$ a system of $K$ equations in the $K+J$ unknowns $(x,\lambda)$.

After inserting your expression for $\lambda$ into the first-order conditions, you obtain the reduced system of $K$ equations in the $K$ unknowns $x$: \begin{equation} \tag{1} (I_K-g_x^T(g_xg_x^T)^{-1}g_x)f_x(x)=0. \end{equation}

This system has infinitely many solutions. Indeed, the matrix $P:=g_x^T(g_xg_x^T)^{-1}g_x$ is the orthogonal projector onto $\operatorname{col}(g^T_x)$, and $\operatorname{rank}(P)=J<K$. The matrix $I_K-P$ is the orthogonal projector onto $\operatorname{col}^\perp (g^T_x)$, and its rank is $K-J$.
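These projector identities are easy to verify numerically for a made-up full-row-rank Jacobian (the values of $J$, $K$, and `g_x` below are illustrative):

```python
import numpy as np

# Made-up full-row-rank Jacobian g_x with J = 2 constraints, K = 5 variables.
rng = np.random.default_rng(1)
J, K = 2, 5
g_x = rng.standard_normal((J, K))

# P = g_x^T (g_x g_x^T)^{-1} g_x, the orthogonal projector onto col(g_x^T)
P = g_x.T @ np.linalg.solve(g_x @ g_x.T, g_x)

assert np.allclose(P @ P, P)     # idempotent
assert np.allclose(P, P.T)       # symmetric, hence an orthogonal projector
assert np.linalg.matrix_rank(P) == J
assert np.linalg.matrix_rank(np.eye(K) - P) == K - J
```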
If you consider only the system (1), the solution is underdetermined, since (1) contains only $K-J$ independent equations. But if you exploit the information contained in the $J$ constraints $g(x)=0$ and append them to system (1), you obtain $K$ independent conditions. This removes the indeterminacy and typically leaves a finite number of candidate solutions.
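The underdetermination is easy to see on a small example. Taking $f(x)=x_1^2+x_2^2$ and $g(x)=x_1+x_2-1$ (my own illustrative choice), every point on the line $x_1=x_2$ solves (1), and only the constraint singles out the true minimizer:

```python
import numpy as np

# Illustration: f(x) = x1^2 + x2^2, g(x) = x1 + x2 - 1, so g_x = [1, 1].
g_x = np.array([[1.0, 1.0]])
P = g_x.T @ np.linalg.solve(g_x @ g_x.T, g_x)
I = np.eye(2)

def residual_eq1(x):
    """Left-hand side of the projected system (1): (I - P) f_x(x)."""
    f_x = 2.0 * x
    return (I - P) @ f_x

# Every point x = (t, t) solves the projected system (1) ...
for t in [-3.0, 0.0, 7.5]:
    assert np.allclose(residual_eq1(np.array([t, t])), 0.0)

# ... but only t = 1/2 also satisfies the constraint g(x) = 0,
# which singles out the actual minimizer x = (0.5, 0.5).
```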