Optimality condition

82 Views Asked by At

I was looking at a few results of convex optimization and I'm stuck with a part of a proof. Consider the following minimization problem: \begin{align} \text{minimize} \quad &\Phi(x) \\ \text{subject to} \quad & K x = g \end{align}

We have to prove the following statement: Let $\Phi: \mathbb R^n \to \mathbb R$ be a convex and differentiable function with domain $D$. Let X be the feasible set: $X=\left\{x: Kx=g \right\}$.

If $x \in X$ is optimal, then the following holds: \begin{equation} \nabla \Phi(x)^T (y - x) \geq 0 \qquad \forall y \in X \end{equation}

(It's actually an if and only if, but the other implication is rather easy)

To prove it, the author uses the reductio ad absurdum. Suppose $x$ is optimal but the condition is not satisfied: there exist $y \in X$ such that: \begin{equation} \nabla \Phi(x)^T (y - x) < 0. \end{equation}

Then, consider the point $z(t) = ty + (1-t)x$, with $t \in [0,1]$. He proves that $z(t)$ is feasible, i.e. $z(t) \in X$. Then he observes that:

\begin{equation} \biggl[\frac{d}{dt}\Phi(z(t))\biggr]_{t=0} = \nabla \Phi(x)^T (x - y) < 0 \end{equation}

The proof is concluded by saying that the last inequality implies that $\Phi(z(t)) < \Phi(x)$, which contradicts the optimality of $x$.

I don't get the last implication. How do you go from: \begin{equation} \biggl[\frac{d}{dt}\Phi(z(t))\biggr]_{t=0} < 0 \end{equation} to $\Phi(z(t)) < \Phi(x)$?

I should probably use the fact that a differentiable function $\Phi$ is convex if and only if, for all $x,y$ in its domain it holds \begin{equation} \Phi(y) \geq \Phi(x) + \nabla \Phi(x)^T(y-x) \end{equation} but I can't see how. I know it is probably a silly/very trivial question, but I'm really stuck.

1

There are 1 best solutions below

0
On BEST ANSWER

Let $f:= \Phi \circ z \colon [0,1] \to\mathbf R$. Then $f$ is differentiable at $0$, that is $$ f(h) = f(0)+ f'(0)h +o(h), \qquad h \to 0$$ Now $f(0) = \Phi(x)$, and $f'(0) < 0 $, let $\epsilon :=\frac 12 |f'(0)|$, choose $\delta > 0$ such that for $h \in (0,\delta)$, we have $|o(h)| < \epsilon|h|$. Then for $h \in (0,\delta)$ $$ f(h) = \Phi(x) + f'(0)h +o(h) \ge \Phi(x) + f'(0)h -\frac12 f'(0)h = \Phi(x) +\frac 12 f'(0)h < \Phi(x) $$ as $f'(0)h < 0$. So this follows by the very definition of differentiability.