Suppose there is a function $f: X \rightarrow \mathbb{R}$, where $X \subseteq \mathbb{R}^n$.
If $x^*$ is a local minimizer of $f$ over $X$, must $x^*$ fall into one of the following two cases?
- $f$ is differentiable at $x^*$, and $x^*$ is a stationary point, i.e., $\nabla f(x^*) = 0$;
- $f$ is not differentiable at $x^*$.
In other words, are there other possible cases for a local minimizer besides these two? I seem to recall seeing this stated as a conclusion somewhere, but I cannot find the source. However, the following proposition seems to challenge it.
From p. 194 of Nonlinear Programming by Dimitri P. Bertsekas (where $X$ is assumed to be a convex set):
Proposition 2.1.2: (Optimality Condition)
(a) If $x^*$ is a local minimum of $f$ over $X$, then $$ \nabla f(x^*)'(x - x^*) \geq 0, \quad \forall x \in X. $$
(b) If $f$ is convex over $X$, then the necessary condition of part (a) is also sufficient for $x^*$ to minimize $f$ over $X$.
If the claim above were true, then whenever $f$ is differentiable at $x^*$ we would have $\nabla f(x^*) = 0$. Under this logic, I am confused about why the book needs part (a) of the proposition as stated.
How should I interpret the proposition geometrically?
Thanks and regards!
Dimitri Bertsekas is right. Your conclusion cannot hold in general: when $X \subset \mathbb{R}^n$ and $x$ lies on the boundary of $X$, you are talking about a constrained minimizer. Note that $X$ must be closed for the optimization problem to make sense. Consider for instance $X = [0,1] \subset \mathbb{R}$ and $f(x) = -e^x$: the minimizer is $x^* = 1$, where $f$ is differentiable yet $f'(1) = -e \neq 0$.
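To make this concrete, here is a minimal numerical check (my own sketch, assuming NumPy is installed). It confirms that $x^* = 1$ is the minimizer, that $f'(x^*) \neq 0$ there, and that the variational inequality of Proposition 2.1.2(a) still holds:

```python
import numpy as np

def f(x):
    return -np.exp(x)

def grad_f(x):
    return -np.exp(x)

x_star = 1.0
xs = np.linspace(0.0, 1.0, 101)   # sample points of X = [0, 1]

print("f(0) =", f(0.0), " f(1) =", f(1.0))   # f(1) = -e < f(0) = -1, so x* = 1
print("f'(x*) =", grad_f(x_star))            # -e, clearly nonzero
# Optimality condition of Proposition 2.1.2(a): the minimum over X is 0,
# attained at x = x*, so the inequality holds with >= 0 (but not with > 0).
print("min of f'(x*)(x - x*) over X:", np.min(grad_f(x_star) * (xs - x_star)))
```

For every feasible $x \neq x^*$, we get $f'(x^*)(x - x^*) > 0$: moving toward any other feasible point increases $f$ to first order.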
The condition given in part (a) of Proposition 2.1.2 says that there exists no (first-order) feasible descent direction for $f$ at $x^*$ within $X$. Note that the convexity of $X$ is assumed here: when $X$ is convex, every feasible direction at $x^*$ is of the form $x - x^*$ for some $x \in X$, which is why the condition quantifies over all $x \in X$.
In general, $x \in X$ is a constrained local minimizer if, by definition, there is an $\epsilon > 0$ such that $f(x) \leq f(y)$ for all $y \in X$ with $\|x - y\| < \epsilon$. This applies even if $f$ is not differentiable.
It's not difficult to show that, if $f$ is differentiable, this implies that $-\nabla f(x)$ lies in the cone normal to $X$ at $x$. You'll find the definitions of the normal and tangent cones in, e.g., the book by Bazaraa, Sherali, and Shetty, or in pretty much any good optimization book. When $X$ is convex, this is exactly what part (a) of Proposition 2.1.2 says.
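For what it's worth, here is a rough sketch of that normal-cone test on the same one-dimensional example. The helper `in_normal_cone` is my own hypothetical illustration (not a library routine): it checks $d'(y - x) \leq 0$ against sampled feasible points, which for convex $X$ is the definition of $d \in N_X(x)$:

```python
import numpy as np

def in_normal_cone(d, x, X_samples, tol=1e-12):
    """Check d'(y - x) <= tol for every sampled feasible point y."""
    return bool(np.all(d * (X_samples - x) <= tol))

X_samples = np.linspace(0.0, 1.0, 101)   # X = [0, 1]
x_star = 1.0
d = np.exp(x_star)                       # d = -f'(x*) = e > 0

# At the boundary minimizer x* = 1, N_X(1) = [0, +inf), and -f'(1) = e lies in it:
print(in_normal_cone(d, x_star, X_samples))    # True
# At an interior point, the normal cone is {0}, so a nonzero d fails the test:
print(in_normal_cone(-1.0, 0.5, X_samples))    # False
```

Geometrically, then, part (a) says the negative gradient at $x^*$ can only point "out of" the feasible set: there is no way to decrease $f$ without leaving $X$. At an interior point the normal cone reduces to $\{0\}$, and the condition collapses to the familiar $\nabla f(x^*) = 0$.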