The whole idea of the Lagrangian was to incorporate the constraints into the objective function (to get an unconstrained optimization problem), but we are still left with the constraint that $λ\geq 0$. How do we deal with this constraint when solving the problem?
$λ\geq 0$ constraint in Lagrangian
117 Views Asked by Bumbble Comm https://math.techqa.club/user/bumbble-comm/detail AtThere are 2 best solutions below
On
An inequality constrained problem \begin{equation*} \begin{aligned} & \underset{x}{\text{minimize}} & & f(x) \\ & \text{subject to} & & g_i(x) \leq 0, \; i = 1, \ldots, m. \end{aligned}\tag{ICP} \end{equation*} can be written as an equality constraint one
\begin{equation*} \begin{aligned} & \underset{x}{\text{minimize}} & & f(x) \\ & \text{subject to} & & g_i(x) = 0, \; \forall j \in \mathcal{A}(x^*) \end{aligned}\tag{ECP} \end{equation*} where $\mathcal{A}(x) = \{ i \colon g_i(x) = 0\}$ is the set of active inequality constraints. ECP can also be written as
\begin{equation*} \begin{aligned} & \underset{x}{\text{minimize}} & & f(x) \\ & \text{subject to} & & g_1(x) = 0, \dots, g_r(x) = 0 \end{aligned}\tag{ECP'} \end{equation*} where $r$ is the number of active constraints. The reason for this equivalence is the complementary slackness condition, CS for short. Intuitively, CS tell us whenever $g_j(x^*) \leq 0$ is slack (meaning $g_j(x^*) < 0$), the constraint $\mu_j^* \geq 0$ must not be slack (meaning $\mu^*_j=0$), and reversely. All the above are theoretically justified by the KKT conditions. In practice, when we solve (ECP) we can apply the first order Lagrangian method
$$ x^{k+1} = x^k - a\: \nabla_x \mathcal{L} (x^k, \mu^k) \tag{1}$$ $$\mu^{k+1} = \mu^k + a \: \nabla_{\mu} \mathcal{L} (x^k, \mu^k)\tag{2}$$ with $\mu = (\mu_1, \dots, \mu_r)$. Notice in $(2)$ we have + as we are maximizing the dual problem. You can also see my answer here which give another perspective on how $\mu_j^* \geq 0$ with $j \in \mathcal{A}(x^*)$.
I will try to give a geometric idea about the convention $\lambda \ge 0$
Suppose we have the problem
$$ \max f(x)\ \ \text{s. t.}\ \ g_k(x) \le 0 $$
In this case to qualify a stationary point $x^*$ we can proceed as follows: as $\nabla g_k(x^*)$ points outwards the feasible region, if $\sum_k \lambda_k \nabla g_k(x^*) = \nabla f(x^*)$ with all $\lambda_k \ge 0$, then $x^*$ can be qualified as a local maximum (the point is stuck in the corner). If any of the $\lambda_k\lt 0$ then $f(x)$ can increase following the manifold with tangent plane caracterized by the normal with $\nabla g_k(x^*)$ direction and in this case the stationary point wouldn't be a local maximum.