Are KKT conditions still necessary and sufficient for optimality in a nonlinear max problem with pseudo-concave objective?


We all know that the KKT conditions are necessary and sufficient for optimality in a convex minimization problem (or a concave maximization problem). Recently, I found that the convexity/concavity requirement can be generalized to pseudo-convexity/pseudo-concavity. Proposition 2.8 of Zappone et al. 2015 considers \begin{align*} \max_{x} \ &r(x) \\ \text{s.t.} \ &c_i(x) \ge 0, \qquad \forall i =1, \dots, I \end{align*} where $r$ and the $c_i$'s are differentiable functions from $\mathcal{C} \subseteq \mathbb{R}^n$ to $\mathbb{R}$. Suppose $r$ is pseudo-concave and the $c_i$'s are quasi-concave. Then, assuming a constraint qualification holds, the KKT conditions are necessary and sufficient for optimality of the above problem.
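To see why the generalization is plausible, here is a minimal numerical sketch (my own toy instance, not from Zappone et al.): $r(x) = \arctan(x)$ is pseudo-concave on $\mathbb{R}$ because $r'(x) = 1/(1+x^2) > 0$ everywhere, yet it is not concave, since $r''(x) > 0$ for $x < 0$. With the affine (hence quasi-concave) constraint $c(x) = 1 - x \ge 0$, the KKT point $\bar{x} = 1$ should be a global maximizer:

```python
import numpy as np

# Toy instance (my own, for illustration): r(x) = arctan(x) is pseudo-concave
# (r'(x) = 1/(1+x^2) > 0 everywhere) but not concave, since r'' > 0 for x < 0.
r = np.arctan
r_grad = lambda x: 1.0 / (1.0 + x**2)

# One affine (hence quasi-concave) constraint: c(x) = 1 - x >= 0.
c = lambda x: 1.0 - x
c_grad = lambda x: -1.0

# Candidate KKT point: x_bar = 1 with multiplier mu = 1/2, since
# r'(1) + mu * c'(1) = 1/2 - 1/2 = 0, mu >= 0, and mu * c(1) = 0.
x_bar, mu = 1.0, 0.5
assert abs(r_grad(x_bar) + mu * c_grad(x_bar)) < 1e-12
assert mu >= 0 and abs(mu * c(x_bar)) < 1e-12

# Sample feasible points and confirm x_bar is a global maximizer.
feasible = np.linspace(-50.0, 1.0, 10_001)
print(np.all(r(feasible) <= r(x_bar)))  # True
```

Of course this only checks one example; the question is whether the general proof is sound.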

The corresponding proof in this paper makes sense to me. However, this paper is the only place I have seen this generalization stated; I have not found it in any other literature or optimization textbook. I am posting this to check whether there is some subtle mistake in the proof that I missed. Thanks in advance!


I found another proof of the generalization to pseudo-convexity/pseudo-concavity. Please refer to Theorems 11 and 16 in Freund 2004.

Consider a constrained optimization problem $$ \begin{array}{ll} \text { (P) } & \min_{x} f(x) \\ \text { s.t. } & g(x) \leq 0 \\ & h(x)=0 \\ & x \in X, \end{array} $$ where $X$ is an open set and $g(x)=\left(g_1(x), \ldots, g_m(x)\right): \Re^n \rightarrow \Re^m$, $h(x)=\left(h_1(x), \ldots, h_l(x)\right): \Re^n \rightarrow \Re^l$.

And we write down the Karush-Kuhn-Tucker conditions: \begin{align*} \text{ (KKT) } && \begin{cases} \nabla f(\bar{x})+\nabla g(\bar{x})^T u+\nabla h(\bar{x})^T v=0, \\ u \geq 0, \\ u_i g_i(\bar{x})=0, i=1, \ldots, m . \end{cases} \end{align*}
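As a sanity check, the three conditions in $(KKT)$ can be verified numerically at a candidate point. The helper and the toy instance below are my own illustration (not from Freund 2004): it returns the stationarity, dual-feasibility, and complementary-slackness residuals, all of which should vanish at a KKT point.

```python
import numpy as np

def kkt_residuals(grad_f, grad_g, grad_h, g_vals, u, v):
    """Return the three (KKT) residuals at a candidate point x_bar.

    grad_f : (n,) gradient of f at x_bar
    grad_g : (m, n) Jacobian of g at x_bar
    grad_h : (l, n) Jacobian of h at x_bar
    g_vals : (m,) values g(x_bar)
    u, v   : candidate multipliers, shapes (m,) and (l,)
    """
    stationarity = grad_f + grad_g.T @ u + grad_h.T @ v
    dual_feas = np.minimum(u, 0.0)   # zero iff u >= 0
    comp_slack = u * g_vals          # zero iff u_i g_i(x_bar) = 0 for all i
    return stationarity, dual_feas, comp_slack

# Toy instance: min x1^2 + x2^2 s.t. g(x) = 1 - x1 - x2 <= 0 (no h).
x_bar = np.array([0.5, 0.5])
res = kkt_residuals(
    grad_f=2 * x_bar,                    # gradient of the objective
    grad_g=np.array([[-1.0, -1.0]]),     # Jacobian of the single constraint
    grad_h=np.zeros((0, 2)),             # no equality constraints
    g_vals=np.array([1 - x_bar.sum()]),  # constraint is active here
    u=np.array([1.0]),
    v=np.zeros(0),
)
print([float(np.max(np.abs(r))) for r in res])  # [0.0, 0.0, 0.0]
```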

Necessity of $(KKT)$: Let $\bar{x}$ be a feasible solution of $(P)$ and let $I=\left\{i: g_i(\bar{x})=0\right\}$. Further, suppose that $\nabla h_i(\bar{x})$ for $i=1, \ldots, l$ and $\nabla g_i(\bar{x})$ for $i \in I$ are linearly independent. If $\bar{x}$ is a local minimum, there exists $(u, v)$ satisfying $(KKT)$.

Proof: $\bar{x}$ must satisfy the Fritz John conditions: there exist multipliers $\left(u_0, u, v\right) \neq 0$ with $u_0 \geq 0$, $u \geq 0$, $u_i g_i(\bar{x})=0$ for $i=1, \ldots, m$, and $$ u_0 \nabla f(\bar{x})+\nabla g(\bar{x})^T u+\nabla h(\bar{x})^T v=0 . $$ If $u_0>0$, we can redefine $u \leftarrow u / u_0$ and $v \leftarrow v / u_0$ to obtain multipliers satisfying $(KKT)$. If $u_0=0$, then $$ \sum_{i \in I} u_i \nabla g_i(\bar{x})+\sum_{i=1}^l v_i \nabla h_i(\bar{x})=0 $$ with $(u, v) \neq 0$ (note $u_i=0$ for $i \notin I$ by complementary slackness), and so the above gradients are linearly dependent. This contradicts the assumptions of the theorem.

Sufficiency of $(KKT)$: Let $\bar{x}$ be a feasible solution of $(P)$, and suppose $\bar{x}$ together with multipliers $(u, v)$ satisfies $(KKT)$. If $f(x)$ is a pseudo-convex function, $g_i(x), i=1, \ldots, m$ are quasi-convex functions, and $h_i(x), i=1, \ldots, l$ are linear functions, then $\bar{x}$ is a global optimal solution of $(P)$.

Proof: Because each $g_i(x)$ is quasi-convex, its sublevel sets are convex; in particular, $$ C_i:=\left\{x \in X: g_i(x) \leq 0\right\}, \quad i=1, \ldots, m $$ are convex sets. Also, because each $h_i(x)$ is linear, the sets $$ D_i:=\left\{x \in X: h_i(x)=0\right\}, \quad i=1, \ldots, l $$ are convex sets. Thus, since the intersection of convex sets is also a convex set, the feasible region $$ S=\{x \in X: g(x) \leq 0, h(x)=0\} $$ is a convex set.
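The first step of the proof can be probed numerically. In this sketch (my own toy case) $g(x) = x^3$ is quasi-convex on $\Re$ because it is monotone, but it is not convex; its sublevel set $C = \{x : g(x) \leq 0\} = (-\infty, 0]$ should nevertheless be convex, i.e., closed under convex combinations:

```python
import numpy as np

rng = np.random.default_rng(0)

# g(x) = x**3 is quasi-convex on R (it is monotone) but not convex;
# its sublevel set C = {x : g(x) <= 0} = (-inf, 0] should be convex.
g = lambda x: x**3

a = -50 * rng.random(10_000)   # sample points with g(a) <= 0
b = -50 * rng.random(10_000)   # another sample in C
lam = rng.random(10_000)       # mixing weights in [0, 1)
combos = lam * a + (1 - lam) * b

print(np.all(g(combos) <= 0))  # True: every convex combination stays in C
```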

Let $I=\left\{i \mid g_i(\bar{x})=0\right\}$ denote the index set of the active constraints at $\bar{x}$, and let $x \in S$ be any point different from $\bar{x}$. Since $S$ is convex, $\lambda x+(1-\lambda) \bar{x}$ is feasible for all $\lambda \in(0,1)$. Thus for $i \in I$ we have $$ g_i(\lambda x+(1-\lambda) \bar{x})=g_i(\bar{x}+\lambda(x-\bar{x})) \leq 0=g_i(\bar{x}) $$ for any $\lambda \in(0,1)$; since the value of $g_i(\cdot)$ does not increase by moving from $\bar{x}$ in the direction $x-\bar{x}$, we must have $\nabla g_i(\bar{x})^T(x-\bar{x}) \leq 0$ for all $i \in I$.
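This active-constraint inequality can also be checked on a toy case of my own: take $g(x) = \|x\|^2 - 1$ (quasi-convex), with $\bar{x}$ on the unit circle so that $g(\bar{x}) = 0$; then $\nabla g(\bar{x})^T(x-\bar{x}) \leq 0$ should hold for every feasible $x$ in the unit disc:

```python
import numpy as np

rng = np.random.default_rng(1)

# Active constraint g(x) = ||x||^2 - 1 <= 0, with x_bar on the boundary,
# so g(x_bar) = 0 and the constraint is active at x_bar.
g_grad = lambda x: 2 * x
x_bar = np.array([1.0, 0.0])

# Random feasible points: uniform sample of the unit disc.
theta = 2 * np.pi * rng.random(10_000)
radius = np.sqrt(rng.random(10_000))
xs = np.stack([radius * np.cos(theta), radius * np.sin(theta)], axis=1)

# Directional derivative of g at x_bar along x - x_bar is nonpositive.
print(np.all(xs @ g_grad(x_bar) - g_grad(x_bar) @ x_bar <= 1e-12))  # True
```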

Similarly, $h_i(\bar{x}+\lambda(x-\bar{x}))=0$ for all $\lambda \in(0,1)$, and so $\nabla h_i(\bar{x})^T(x-\bar{x})=0$ for all $i=1, \ldots, l$.

Thus, from $(KKT)$, $$ \nabla f(\bar{x})^T(x-\bar{x})=-\left(\nabla g(\bar{x})^T u+\nabla h(\bar{x})^T v\right)^T(x-\bar{x}) \geq 0, $$ where the inequality holds because $u_i=0$ for $i \notin I$ (complementary slackness), $u_i \nabla g_i(\bar{x})^T(x-\bar{x}) \leq 0$ for $i \in I$, and $\nabla h_i(\bar{x})^T(x-\bar{x})=0$ for all $i$. By pseudo-convexity, $\nabla f(\bar{x})^T(x-\bar{x}) \geq 0$ implies $f(x) \geq f(\bar{x})$ for any feasible $x$, so $\bar{x}$ is a global minimizer. $\square$
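Finally, the sufficiency theorem can be illustrated end-to-end on an instance of (P) of my own making, where the objective is pseudo-convex but not convex: $f(x) = x + x^3$ has $f'(x) = 1 + 3x^2 > 0$ everywhere (so $f$ is pseudo-convex) yet $f''(x) < 0$ for $x < 0$, with the single quasi-convex constraint $g(x) = -x \leq 0$:

```python
import numpy as np

# Toy instance of (P): f(x) = x + x**3 is pseudo-convex
# (f'(x) = 1 + 3x**2 > 0 everywhere) but not convex (f'' < 0 for x < 0).
f = lambda x: x + x**3
f_grad = lambda x: 1 + 3 * x**2

# One quasi-convex constraint g(x) = -x <= 0 (feasible region x >= 0); no h.
x_bar, u = 0.0, 1.0

# (KKT) at x_bar: f'(0) + u * g'(0) = 1 - 1 = 0, u >= 0, u * g(0) = 0.
assert f_grad(x_bar) + u * (-1.0) == 0 and u >= 0 and u * (-x_bar) == 0

# By the sufficiency theorem, x_bar is a global minimizer of (P);
# verify on a sample of the feasible region.
feasible = np.linspace(0.0, 100.0, 10_001)
print(np.all(f(feasible) >= f(x_bar)))  # True
```

Note that this is only a one-dimensional spot check; the theorem itself is what guarantees global optimality.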