Zero-order necessary conditions


I have a question regarding the Zero-order necessary conditions. In my Linear and Nonlinear programming book it is stated:

[Figure from the book: the graph of $f$ with its epigraph $\Gamma$ above it, and the tubular region $B$ with cross section $\Omega$ below $f^*$.]

Consider the set $\Gamma = \{(r,\textbf{x}): r\geq f(\textbf{x}), \textbf{x}\in E^n\}\subset E^{n+1}.$ In a figure of the graph of $f$, the set $\Gamma$ is the region above the graph, shown in the upper part of the figure. This set is called the epigraph of $f$. It is easy to verify that the set $\Gamma$ is convex if $f$ is a convex function. Suppose that $\textbf{x}^*\in \Omega$ is the minimizing point with value $f^*= f(\textbf{x}^*).$ We construct a tubular region with cross section $\Omega$, extending vertically from $-\infty$ up to $f^*$, shown as $B$ in the upper part of the figure. This is also a convex set, and it overlaps the set $\Gamma$ only at the boundary point $(f^*, \textbf{x}^*)$ above $\textbf{x}^*$ (or possibly at many boundary points if $f$ is flat near $\textbf{x}^*$).
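(The convexity claim for $\Gamma$ that the book calls "easy to verify" is indeed a one-line check; I sketch it here for completeness. Take $(r_1,\textbf{x}_1),(r_2,\textbf{x}_2)\in\Gamma$ and $\alpha\in[0,1]$. Then

$$\alpha r_1+(1-\alpha)r_2 \;\geq\; \alpha f(\textbf{x}_1)+(1-\alpha)f(\textbf{x}_2) \;\geq\; f\big(\alpha\textbf{x}_1+(1-\alpha)\textbf{x}_2\big),$$

where the first inequality uses $r_i\geq f(\textbf{x}_i)$ and the second is the convexity of $f$, so the convex combination of the two points again lies in $\Gamma$.)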

According to the separating hyperplane theorem, there is a hyperplane separating these two sets. This hyperplane can be represented by a nonzero vector of the form $(s,\boldsymbol \lambda)\in E^{n+1}$ with $s$ a scalar and $\boldsymbol \lambda\in E^n$, and a separation constant $c$. The separation conditions are

$$sr + \boldsymbol \lambda^T\textbf{x} \geq c\;\;\;\text{for all $\textbf{x} \in E^n$ and $r\geq f(\textbf{x})$}\;\;\;\;\;(1)$$ $$sr + \boldsymbol \lambda^T\textbf{x} \leq c\;\;\;\text{for all $\textbf{x} \in \Omega$ and $r\leq f^*$}.\;\;\;\;\;\;\;\;\;(2)$$

It follows that $s\neq0$; for otherwise $\boldsymbol \lambda \neq\textbf{0}$ and then (1) would be violated for some $\textbf{x}\in E^n$. It also follows that $s\geq0$, since otherwise (2) would be violated for sufficiently negative values of $r$. Hence, together we find $s>0$, and by appropriate scaling we may take $s=1$.
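(The book does not spell this step out, so the following is my reading of it. With $s=1$, the left-hand side of (1) is smallest at $r=f(\textbf{x})$ and the left-hand side of (2) is largest at $r=f^*$, so the two conditions are equivalent to

$$f(\textbf{x}) + \boldsymbol \lambda^T\textbf{x} \geq c\;\;\;\text{for all $\textbf{x}\in E^n$},\qquad f^* + \boldsymbol \lambda^T\textbf{x} \leq c\;\;\;\text{for all $\textbf{x}\in \Omega$}.$$

Moreover, the point $(f^*,\textbf{x}^*)$ satisfies both, which forces $c = f^* + \boldsymbol \lambda^T\textbf{x}^*$.)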

Next, consider the problem:

$$\text{minimize}\;\;\;f(\textbf{x})$$ $$\text{subject to}\;\;\;\textbf{x}\in\Omega$$

The stated conditions (1) and (2) can be expressed alternatively in the following way:

Zero-order necessary conditions: If $\textbf{x}^*$ solves the above problem under stated convexity conditions, then there is a nonzero vector $\boldsymbol \lambda \in E^n$ such that $\textbf{x}^*$ is a solution to the two problems:

$$\text{minimize}\;\;\;f(\textbf{x}) + \boldsymbol \lambda^T\textbf{x}$$ $$\text{subject to}\;\;\;\textbf{x}\in E^n$$

and

$$\text{maximize}\;\;\;\boldsymbol \lambda^T\textbf{x}$$ $$\text{subject to}\;\;\;\textbf{x}\in \Omega$$

My question is: "What happened to the variable $r$ in the minimization and maximization problems stated last? In the minimization problem, is it because $r$ was set equal to $f(\textbf{x})$? And why does $r$ simply drop out of the maximization problem?"

In other words: "How did the Zero-order necessary conditions follow from the above derivations?"

Hope my question is clear =) Thank you for any help =)