$λ\geq 0$ constraint in Lagrangian

117 Views Asked by At

The whole idea of the Lagrangian was to incorporate the constraints into the objective function (to get an unconstrained optimization problem), but we are still left with the constraint that $λ\geq 0$. How do we deal with this constraint when solving the problem?

2

There are 2 best solutions below

1
On

I will try to give a geometric idea about the convention $\lambda \ge 0$

Suppose we have the problem

$$ \max f(x)\ \ \text{s. t.}\ \ g_k(x) \le 0 $$

In this case to qualify a stationary point $x^*$ we can proceed as follows: as $\nabla g_k(x^*)$ points outwards the feasible region, if $\sum_k \lambda_k \nabla g_k(x^*) = \nabla f(x^*)$ with all $\lambda_k \ge 0$, then $x^*$ can be qualified as a local maximum (the point is stuck in the corner). If any of the $\lambda_k\lt 0$ then $f(x)$ can increase following the manifold with tangent plane caracterized by the normal with $\nabla g_k(x^*)$ direction and in this case the stationary point wouldn't be a local maximum.

0
On

An inequality constrained problem \begin{equation*} \begin{aligned} & \underset{x}{\text{minimize}} & & f(x) \\ & \text{subject to} & & g_i(x) \leq 0, \; i = 1, \ldots, m. \end{aligned}\tag{ICP} \end{equation*} can be written as an equality constraint one

\begin{equation*} \begin{aligned} & \underset{x}{\text{minimize}} & & f(x) \\ & \text{subject to} & & g_i(x) = 0, \; \forall j \in \mathcal{A}(x^*) \end{aligned}\tag{ECP} \end{equation*} where $\mathcal{A}(x) = \{ i \colon g_i(x) = 0\}$ is the set of active inequality constraints. ECP can also be written as

\begin{equation*} \begin{aligned} & \underset{x}{\text{minimize}} & & f(x) \\ & \text{subject to} & & g_1(x) = 0, \dots, g_r(x) = 0 \end{aligned}\tag{ECP'} \end{equation*} where $r$ is the number of active constraints. The reason for this equivalence is the complementary slackness condition, CS for short. Intuitively, CS tell us whenever $g_j(x^*) \leq 0$ is slack (meaning $g_j(x^*) < 0$), the constraint $\mu_j^* \geq 0$ must not be slack (meaning $\mu^*_j=0$), and reversely. All the above are theoretically justified by the KKT conditions. In practice, when we solve (ECP) we can apply the first order Lagrangian method

$$ x^{k+1} = x^k - a\: \nabla_x \mathcal{L} (x^k, \mu^k) \tag{1}$$ $$\mu^{k+1} = \mu^k + a \: \nabla_{\mu} \mathcal{L} (x^k, \mu^k)\tag{2}$$ with $\mu = (\mu_1, \dots, \mu_r)$. Notice in $(2)$ we have + as we are maximizing the dual problem. You can also see my answer here which give another perspective on how $\mu_j^* \geq 0$ with $j \in \mathcal{A}(x^*)$.