I am confused about the KKT conditions. I have seen similar questions asked here, but I think none of the questions/answers cleared up my confusion.
In Boyd and Vandenberghe's Convex Optimization (Sec. 5.5.3), the KKT conditions are explained in the following way.
(I) For any differentiable (potentially non-convex) problem: If strong duality holds, then any primal/dual (global) optimal pair must satisfy the KKT conditions (i.e., the gradient of the Lagrangian must vanish, the points must be primal/dual feasible, and they must satisfy complementary slackness).
(II) For convex problems: If the problem is convex, then (a) any (primal/dual) points that satisfy the KKT conditions (same as above) are (global) primal/dual optimal pairs and (b) strong duality holds.
Using I and II, Boyd and Vandenberghe conclude that for convex problems that satisfy Slater's condition (so that strong duality holds), the KKT conditions are both necessary and sufficient for (global) primal/dual optimality.
Now, in traditional nonlinear programming textbooks, the same KKT conditions are presented as a first-order necessary condition for local optimality in any differentiable (but potentially non-convex) problem. Those references do not discuss dual points (the multipliers are treated simply as Lagrange multipliers) or strong duality: (III) for any regular locally optimal (primal) point, there must exist Lagrange multipliers such that together they satisfy the KKT conditions (same as above).
I have three related questions:
(Q1) Does III imply that the strong duality requirement in I is unnecessary? (Edit: I realized that III is a necessary condition for regular local optima, but it would still be great to hear about the relation between I and III.)
(Q2) What can be said in general about the KKT conditions in differentiable nonlinear programs that do not satisfy strong duality?
(Q3) Consider a general nonlinear program (primal) with differentiable cost and constraints where strong duality does not hold. Now imagine I have found all KKT pairs for the primal. The Lagrange multipliers in my KKT pairs are clearly feasible for the dual problem. But is it also guaranteed that every regular local optimum of the dual problem appears among the KKT pairs of the primal?
My guess: I guess the answer to Q1 is negative - if strong duality does not hold, regular primal (global/local) optimal points must still satisfy the KKT conditions with some Lagrange multipliers that may not have anything to do with (optimal) dual points (?).
I think that your guess in (Q1) is correct.
Consider the following optimization problem: \begin{align} &\min_{x\in \mathbb{R}^n}\ f_0(x)\\ &\mathrm{s.t.}\ \ f_i(x) \le 0, \ i=1,2, \cdots,m\\ &\qquad h_j(x) = 0, \ j=1,2,\cdots, p \end{align} where $f_0$, $f_i, \forall i$ and $h_j, \forall j$ are all differentiable. The KKT conditions are the following \begin{align} \nabla f_0(x^\ast) + \sum_{i=1}^m \lambda_i^\ast \nabla f_i(x^\ast) + \sum_{j=1}^p \mu_j^\ast \nabla h_j(x^\ast) &= 0, \\ f_i(x^\ast) &\le 0, \ i = 1, 2, \cdots, m\\ h_j(x^\ast) &= 0, \ j=1, 2, \cdots, p\\ \lambda_i^\ast &\ge 0, \ i=1, 2, \cdots, m\\ \lambda_i^\ast f_i(x^\ast) &= 0, \ i = 1, 2, \cdots, m. \end{align}
See [1] and [2, Ch. 9, p. 356].
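To make the five conditions concrete, here is a minimal numerical sketch that measures how strongly a candidate $(x, \lambda, \mu)$ violates each of them. The helper name `kkt_violations` and the toy problem (minimize $x_1^2 + x_2^2$ subject to $x_1 + x_2 - 1 = 0$ and $-x_1 \le 0$) are assumptions made for illustration; they are not taken from the references.

```python
import numpy as np

def kkt_violations(x, lam, mu, grad_f0, f_ineq, jac_ineq, h_eq, jac_eq):
    """Return the size of the violation of each KKT condition at (x, lam, mu)."""
    x, lam, mu = (np.asarray(v, dtype=float) for v in (x, lam, mu))
    fi = np.asarray(f_ineq(x), dtype=float).reshape(-1)   # f_i(x), shape (m,)
    hj = np.asarray(h_eq(x), dtype=float).reshape(-1)     # h_j(x), shape (p,)
    Jf = np.asarray(jac_ineq(x), dtype=float).reshape(len(fi), len(x))
    Jh = np.asarray(jac_eq(x), dtype=float).reshape(len(hj), len(x))
    # Gradient of the Lagrangian with respect to x.
    stationarity = grad_f0(x) + Jf.T @ lam + Jh.T @ mu
    return {
        "stationarity": float(np.linalg.norm(stationarity)),
        "primal_ineq": float(np.max(np.maximum(fi, 0.0), initial=0.0)),
        "primal_eq": float(np.max(np.abs(hj), initial=0.0)),
        "dual_feas": float(np.max(np.maximum(-lam, 0.0), initial=0.0)),
        "comp_slack": float(np.max(np.abs(lam * fi), initial=0.0)),
    }

# Hypothetical toy problem: min x1^2 + x2^2  s.t.  x1 + x2 - 1 = 0,  -x1 <= 0.
problem = dict(
    grad_f0=lambda x: 2.0 * x,
    f_ineq=lambda x: np.array([-x[0]]),
    jac_ineq=lambda x: np.array([[-1.0, 0.0]]),
    h_eq=lambda x: np.array([x[0] + x[1] - 1.0]),
    jac_eq=lambda x: np.array([[1.0, 1.0]]),
)

# Candidate KKT triple: x = (1/2, 1/2), lambda = 0 (inactive constraint), mu = -1.
at_optimum = kkt_violations([0.5, 0.5], [0.0], [-1.0], **problem)
# Same x with the wrong equality multiplier: stationarity fails.
wrong_multiplier = kkt_violations([0.5, 0.5], [0.0], [0.0], **problem)

print(at_optimum)
print(wrong_multiplier)
```

All entries of `at_optimum` should be zero up to rounding, while `wrong_multiplier` reports a nonzero stationarity residual. Note that the equality multiplier is negative here, which is allowed: only the inequality multipliers $\lambda_i$ are sign-constrained.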
The Lagrangian is given by $$L(x, \lambda, \mu) = f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) + \sum_{j=1}^p \mu_j h_j(x)$$ with $\lambda \in \mathbb{R}^m$ and $\mu \in \mathbb{R}^p$.
The Lagrange dual function is given by $$g(\lambda, \mu) = \inf_{x} L(x, \lambda, \mu).$$
Let $x^\ast$ and $(\lambda^\ast, \mu^\ast)$ be primal and dual optimal, respectively.
It holds that $g(\lambda, \mu) \le f_0(x^\ast)$ for all dual feasible $(\lambda, \mu)$, i.e., $\lambda \ge 0$ and $(\lambda, \mu) \in \mathrm{dom}\, g$. This property is called weak duality.
If $f_0(x^\ast) = g(\lambda^\ast, \mu^\ast)$, then strong duality holds (zero duality gap).
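The following sketch illustrates weak duality with a strictly positive gap on a small non-convex problem chosen here for illustration (it is not from the references): minimize $(x^2 - 1)^2$ subject to $x - 1/2 = 0$. The dual function is approximated by minimizing the Lagrangian over a grid, so the numbers are approximations of the true infimum, and the grid bounds are assumptions that happen to contain the minimizers for the multipliers considered.

```python
import numpy as np

def f0(x):
    return (x**2 - 1.0) ** 2          # non-convex objective (double well)

def h(x):
    return x - 0.5                    # single equality constraint h(x) = 0

def dual_function(mu, xs):
    """Approximate g(mu) = inf_x L(x, mu) by minimizing over the grid xs."""
    return np.min(f0(xs) + mu * h(xs))

xs = np.linspace(-3.0, 3.0, 60001)    # grid for the inner minimization over x
mus = np.linspace(-2.0, 2.0, 401)     # grid of equality multipliers
g = np.array([dual_function(mu, xs) for mu in mus])

x_star = 0.5                          # the only feasible point, hence optimal
p_star = f0(x_star)                   # primal optimal value
d_star = g.max()                      # approximate dual optimal value
mu_dual = mus[g.argmax()]             # approximate dual optimal multiplier

# KKT multiplier from stationarity: f0'(x*) + mu * h'(x*) = 0 with h' = 1.
mu_kkt = -4.0 * x_star * (x_star**2 - 1.0)

print("p* =", p_star, " d* ~", d_star, " gap ~", p_star - d_star)
print("dual optimal mu ~", mu_dual, " KKT multiplier =", mu_kkt)
print("g at the KKT multiplier ~", dual_function(mu_kkt, xs))
```

Analytically, $p^\ast = 9/16$, while $g(0) = 0$ and $g(\mu) \le \min(-3\mu/2,\ \mu/2) < 0$ for $\mu \ne 0$, so $d^\ast = 0$ and the gap is $9/16$. The point $x^\ast = 1/2$ is regular and satisfies the KKT conditions with $\mu^\ast = 3/2$, yet this multiplier is not dual optimal, which is consistent with the guess in (Q1).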
We have:
If $x^\ast$ is locally optimal and regular (i.e., a constraint qualification, such as linear independence of the gradients of the active constraints, holds at $x^\ast$), then there exists $(\lambda^\ast, \mu^\ast)$ such that the KKT conditions hold.
If strong duality holds, the KKT conditions are necessary optimality conditions: if $x^\ast$ and $(\lambda^\ast, \mu^\ast)$ are primal and dual optimal, then the KKT conditions hold.
For convex problems with strong duality (e.g., when Slater's condition is satisfied), the KKT conditions are sufficient and necessary optimality conditions, i.e., $x^\ast$ and $(\lambda^\ast, \mu^\ast)$ are primal and dual optimal if and only if the KKT conditions hold.
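As a small worked illustration of the convex case (an example chosen here, not taken from the references), consider minimizing $f_0(x) = x_1^2 + x_2^2$ subject to $f_1(x) = 1 - x_1 - x_2 \le 0$. Slater's condition holds, since $x = (1, 1)$ is strictly feasible. The dual function is \begin{align} g(\lambda) = \inf_{x} \left[ x_1^2 + x_2^2 + \lambda (1 - x_1 - x_2) \right] = \lambda - \frac{\lambda^2}{2}, \end{align} where the infimum is attained at $x_1 = x_2 = \lambda/2$. Maximizing over $\lambda \ge 0$ gives $\lambda^\ast = 1$ and $g(\lambda^\ast) = 1/2$. On the primal side, the KKT conditions read \begin{align} 2x_1 - \lambda = 0, \quad 2x_2 - \lambda &= 0, \\ 1 - x_1 - x_2 &\le 0, \\ \lambda &\ge 0, \\ \lambda (1 - x_1 - x_2) &= 0. \end{align} Taking $\lambda = 0$ forces $x = 0$, which is infeasible, so the constraint must be active, which gives $x^\ast = (1/2, 1/2)$ and $\lambda^\ast = 1$. The unique KKT pair is therefore exactly the primal/dual optimal pair, and $f_0(x^\ast) = 1/2 = g(\lambda^\ast)$, i.e., the duality gap is zero.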
References
[1] https://en.wikipedia.org/wiki/Karush%E2%80%93Kuhn%E2%80%93Tucker_conditions
[2] Chong-Yung Chi, Wei-Chiang Li, Chia-Hsiang Lin, "Convex Optimization for Signal Processing and Communications: From Fundamentals to Applications", 2017.