Stationarity condition for unit ball

57 Views Asked by At

Derivation of the necessary and sufficient stationarity conditions over the unit ball.

For $x$ in the unit ball, $x$ is a minimum point if $\nabla f(x) = 0$ or $||x|| = 1$ and $\exists l \leq 0$ s.t $\nabla f(x) = l*x$

Of course the first part makes sense however I don't understand how the second part of this is derived. I can see that using the definition of convexity it is obvious that $||x|| = 1$ however I'm not sure how to get the second part of the result.

1

There are 1 best solutions below

0
On

$\def\eqdef{\stackrel{\text{def}}{=}}$ As given, the statement is not true. The given conditions are not generally sufficient for $\ x\ $ to be a minimum of $\ f\ $ (although they are sufficient if you restrict $\ f\ $ to be convex). They are necessary however. Here's a proof for the case when the minimum $\ x\ $ lies on the surface of the unit ball.

If $\ x\ $ is a minimum of $\ f\ $ over the unit ball, and $\ \|x\|=1\ $, let $\ u\ $ be any unit vector perpendicular to $\ x\ $ and define $$ g(t)\eqdef f(\cos(t)x+\sin(t)u) $$ Then since $\ \|\cos(t)x+\sin(t)u\|=1\ $ for all $\ t\ $ we must have \begin{align} g(t)&=f(\cos(t)x+\sin(t)u)\\ &\ge f(x)\\ &=g(0)\ . \end{align} Thus $\ g(t)\ $ is at a minimum when $\ t=0\ $. Therefore \begin{align} 0&=g'(0)\\ &=\langle \nabla f(x),u\rangle \end{align} Since this holds for all unit vectors perpendicular to $\ x\ $ it follows that \begin{align} \nabla f(x)&\in(\{u\}^\perp)^\perp\\ &=\{\,\lambda x\,|\,\lambda\in\Bbb{R}\,\}\ , \end{align} or, in other words, $\ \nabla f(x)=\lambda x\ $ for some $\ \lambda\in\Bbb{R}\ $. Now define $$ h(t)\eqdef f(tx)\ . $$ Then for all $\ t\in[-1,1]\ $, $\ tx\ $ is in the unit ball, and therefore \begin{align} h(t)&=f(tx)\\ &\ge f(x)\\ &=h(1)\ . \end{align} Thus $\ h(t)\ $ is at a minimum over the interval $\ [-1,1]\ $ when $\ t=1\ $. Therefore \begin{align} 0&\ge h'(1)\\ &=\langle \nabla f(x),x\rangle\\ &=\lambda\langle x,x\rangle\\ &=\lambda \end{align} which completes the proof that $\ \nabla f(x)=\lambda x\ $ for some $\ \lambda\le0\ $.