In Boyd's *Convex Optimization*, the following optimisation problem is considered $$ \begin{align} \min\quad & f_0(x)\\ \text{s.t.}\quad & f_i(x)\le 0,\quad i=1:m,\quad m\in\Bbb Z_{\ge 0}\\ & h_j(x) = 0,\quad j=1:p,\quad p\in\Bbb Z_{\ge 0} \end{align}\tag{P} $$ For the time being the functions $f_0,f_{1:m},h_{1:p}:\Bbb R^n\to\Bbb R$ are assumed $C^1$ but not necessarily convex. Let $D$ be the feasible domain specified by the conditions $f_{1:m}(x)\le 0$ and $h_{1:p}(x)=0$. Then the associated Lagrangian is $$L:D\times \Bbb R^m\times\Bbb R^p\ni(x,\lambda,\mu)\mapsto f_0(x)+\lambda^T f(x) + \mu^T h(x)\in\Bbb R$$ where $f:=(f_1,\cdots,f_m)^T$ and $h:=(h_1,\cdots,h_p)^T$, and the Lagrange dual function is thus defined as $$g:\Bbb R^m\times \Bbb R^p\ni(\lambda,\mu)\mapsto \inf_{x\in D}L(x,\lambda,\mu)\in\Bbb R\cup\{-\infty\}.$$ We then formulate the dual problem (as opposed to the primal (P)) as $$ \begin{align} \max\quad & g(\lambda, \mu)\\ \text{s.t.}\quad & \lambda \in\Bbb R_{\ge 0}^m\\ \quad & \mu\in\Bbb R^p \end{align}\tag{D} $$ Now, suppose $\bar x$ is feasible for (P) and $(\bar\lambda,\bar\mu)$ is feasible for (D), and suppose that strong duality holds. Then one easily shows that $$\DeclareMathOperator*{\argmin}{argmin} f_0(\bar x) = g(\bar\lambda,\bar\mu)=\inf_{x\in \color{red}{D}}L(x,\bar\lambda,\bar\mu)\le L(\bar x, \bar\lambda,\bar\mu)\le f_0(\bar x)\implies \bar x\in\argmin_{x\in\color{red}{D}}L(x,\bar\lambda,\bar\mu).$$ But then Boyd claims that, "since $\bar x$ minimises $L$ over $x$, its gradient must vanish at $\bar x$", i.e. that $\nabla_x L(\bar x,\bar\lambda, \bar\mu)=0$. WHY is this true?
Note that Fermat's lemma may not apply: $\bar x$ merely minimises $L(\cdot,\bar\lambda,\bar\mu)$ over the primal feasible domain $\color{red}{D}$, of which $\bar x$ is NOT necessarily an interior point; indeed $D$ itself may well have empty interior in $\Bbb R^n$ (e.g. whenever equality constraints are present). So why can we conclude that the gradient w.r.t. $x$ vanishes at $\bar x$?
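For concreteness, here is a toy instance (my own example, not from the book) where the chain of equalities above, and the vanishing gradient, can be checked numerically:

```python
# Toy instance (my own example, not from Boyd): minimize f0(x) = x^2
# subject to f1(x) = 1 - x <= 0 (i.e. x >= 1), no equality constraints.
# Dual function: g(lam) = inf_x [x^2 + lam*(1 - x)] = lam - lam**2/4,
# maximized at lam = 2 with g(2) = 1; the primal optimum is x = 1 with
# f0(1) = 1, so strong duality holds.

x_bar, lam_bar = 1.0, 2.0

f0 = lambda x: x**2
L = lambda x, lam: x**2 + lam * (1 - x)
grad_x_L = lambda x, lam: 2*x - lam  # derivative of L in x

# The chain f0(x_bar) = g(lam_bar) <= L(x_bar, lam_bar) <= f0(x_bar)
# collapses to equalities:
g_bar = lam_bar - lam_bar**2 / 4
assert f0(x_bar) == g_bar == L(x_bar, lam_bar) == 1.0

# ... and, as Boyd claims, the gradient of L(., lam_bar) vanishes at x_bar:
assert grad_x_L(x_bar, lam_bar) == 0.0
```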
Thank you.
As LittleO kindly pointed out, my problem is actually a non-problem that arose from confusion over the definition of $D$: it is NOT the feasible region of $(P)$; rather, it is the intersection of the domains of $f_0,f_{1:m}$ and $h_{1:p}$, which are assumed open (hence making $D$ open too).
$D$ is not the set of all points $x$ where the constraints are satisfied. Rather, as stated at the beginning of section 5.1, $D = \bigcap_{i=0}^m \textbf{dom } f_i \, \cap \, \bigcap_{i=1}^p \textbf{dom } h_i$.
At the beginning of section 5.5.3, the book assumes that the functions $f_i$ and $h_i$ have open domains. It follows that $D$ is an open set. Thus, Fermat's lemma does apply.
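Spelled out: since $D$ is open, every point of $D$, and in particular $\bar x$, is an interior point. So for each direction $v\in\Bbb R^n$ the map $t\mapsto L(\bar x+tv,\bar\lambda,\bar\mu)$ is defined on a neighbourhood of $t=0$ and attains a local minimum there; differentiating at $t=0$ gives $$v^T\nabla_x L(\bar x,\bar\lambda,\bar\mu)=0\quad\text{for all } v\in\Bbb R^n\implies \nabla_x L(\bar x,\bar\lambda,\bar\mu)=0.$$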
By the way, one way to motivate the Lagrangian intuitively is that we are attempting to replace the hard constraints (which are difficult to handle) with linear penalty terms in the objective function (which are hopefully easy to handle). By introducing the Lagrangian, we hope to solve an unconstrained minimization problem rather than a constrained problem. (But, this strategy requires us to figure out the correct values of the Lagrangian multipliers -- which leads us to solving the dual problem.)
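A minimal sketch of that strategy on a toy problem of my own (minimize $x^2$ subject to $x\ge 1$): solve the dual to find the multiplier, then minimize the Lagrangian *unconstrained*.

```python
# Hypothetical example, not from the book:
# minimize f0(x) = x^2 subject to f1(x) = 1 - x <= 0 (i.e. x >= 1).
# Lagrangian: L(x, lam) = x^2 + lam*(1 - x); its unconstrained
# minimizer over x is x*(lam) = lam/2, giving g(lam) = lam - lam**2/4.

def x_star(lam):
    """Unconstrained minimizer of L(., lam)."""
    return lam / 2.0

def g(lam):
    """Lagrange dual function."""
    return lam - lam**2 / 4.0

# Solve the dual by a coarse grid search over lam >= 0:
lam_bar = max((k / 100.0 for k in range(501)), key=g)

# With the correct multiplier, the *unconstrained* minimizer of the
# Lagrangian is exactly the constrained optimum x = 1 ...
assert abs(lam_bar - 2.0) < 1e-12 and x_star(lam_bar) == 1.0

# ... while a wrong multiplier gives a different (here infeasible) point.
assert x_star(1.0) == 0.5
```

The grid search stands in for whatever dual-ascent method one would actually use; the point is only that the hard constraint never appears in the inner minimization.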