Continuity of a function at $x^*$ implies that a property $P(f(x^*))$ holds for points near $x^*$?


I'm working through Numerical Optimization by Nocedal and Wright, and I'm having trouble with some of its proofs that seem too handwavy to me. Take the first theorem for example:

Theorem: If $x^∗$ is a local minimizer and $f$ is continuously differentiable in an open neighborhood of $x^∗$, then $∇ f (x^∗) = 0$.

Proof (excerpt): Suppose for contradiction that $∇ f (x^∗) \neq 0$. Define the vector $p = −∇ f (x^∗)$ and note that $$p^T ∇ f (x^∗) = − \| ∇ f (x^∗)\|^2 < 0.$$ Because $∇ f$ is continuous near $x^∗$, there is a scalar $T > 0$ such that $$\forall t ∈ [0, T ],\quad p^T ∇ f (x^∗ + tp) < 0.$$

Similarly, in another proof (the second-order necessary condition), the authors assume $∇^2 f (x^∗)$ is not positive semidefinite (so $\exists p\in\mathbb{R}^n$ with $p^T ∇^2 f (x^∗)p < 0$), and use the (assumed) continuity of $∇^2 f$ near $x^*$ to conclude the existence of a scalar $T > 0$ such that $\forall t ∈ [0, T ],\ p^T ∇^2 f (x^∗+tp)p < 0$.

On the surface, the argument roughly seems to be: if $f$ is continuous at $x^*$, and a certain property $P$ holds for $f(x^*)$, then $P$ holds for $f(x)$ when $x$ is near $x^*$. But of course this is not true in general (for example, take $f$ to be $f(x)=x$, $x^*=0$, $P(f(x^*))$ to be $f(x^*)\geq0$).

I would appreciate some help justifying statements such as "Because $∇ f$ is continuous near $x^∗$, there is a scalar $T > 0$ such that $\forall t ∈ [0, T ],\ p^T ∇ f (x^∗ + tp) < 0$", specifically using the $\epsilon$-$\delta$ definition of a continuous function on a metric space (what is the appropriate $\epsilon$ here?).

Best answer

The essence of this step is the following lemma: if $\phi : X \to \Bbb{R}$ is continuous at a point $x_0 \in X$ and $\phi(x_0) < 0$, then there exists some $\delta > 0$ such that $\phi(x) < 0$ for all $x \in B(x_0; \delta)$. I hope you can justify why, for fixed $p$ and continuously differentiable $f$, the map $$x \mapsto p^\top \nabla f(x)$$ is continuous.

To prove this lemma, take $\varepsilon = |\phi(x_0)| = -\phi(x_0) > 0$. By the $\varepsilon$-$\delta$ definition of continuity, there must exist some $\delta > 0$ such that \begin{align*} d(x, x_0) < \delta &\implies |\phi(x) - \phi(x_0)| < \varepsilon = -\phi(x_0) \\ &\implies \phi(x_0) < \phi(x) - \phi(x_0) < -\phi(x_0) \\ &\implies 2\phi(x_0) < \phi(x) < 0. \end{align*} Hence, $\phi(x) < 0$ for all $x \in B(x_0; \delta)$.
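
To connect the lemma back to the book's statement, here is one way the remaining details could be filled in. This is a sketch under a few assumptions that are not spelled out above: $\|\cdot\|$ is the Euclidean norm on $\mathbb{R}^n$, $X$ is the open neighborhood of $x^*$ on which $f$ is continuously differentiable, $\phi(x) = p^\top \nabla f(x)$, and $x_0 = x^*$. Continuity of $\phi$ at $x^*$ follows from the Cauchy-Schwarz inequality, and the lemma is applied with $\varepsilon = -\phi(x^*)$: $$\begin{align*} |\phi(x) - \phi(x^*)| &= |p^\top(\nabla f(x) - \nabla f(x^*))| \le \|p\| \, \|\nabla f(x) - \nabla f(x^*)\|, \\ \varepsilon &= -\phi(x^*) = \|\nabla f(x^*)\|^2 > 0, \\ T &= \frac{\delta}{2\|p\|} \quad (\text{well defined, since } p = -\nabla f(x^*) \neq 0), \\ t \in [0, T] &\implies \|(x^* + tp) - x^*\| = t\|p\| \le \frac{\delta}{2} < \delta \implies p^\top \nabla f(x^* + tp) < 0. \end{align*}$$

Here $\delta$ is the radius supplied by the lemma, shrunk if necessary so that the ball of radius $\delta$ around $x^*$ lies inside $X$. The choice $T = \delta / (2\|p\|)$ is just one convenient option; any $T > 0$ with $T\|p\| < \delta$ works. The Hessian case should go the same way with $\phi(x) = p^\top \nabla^2 f(x)\, p$ and $\varepsilon = -p^\top \nabla^2 f(x^*)\, p$.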