Proof that $\nabla f(x^*) = 0$ is a necessary condition for a minimizer, using Taylor expansions



Look at this part:

Define the vector $p = -\nabla f(x^*)$ and note that $p^T\nabla f(x^*) = -\|\nabla f(x^*)\|^2 <0$. Because $\nabla f$ is continuous near $x^*$, there is a scalar $T>0$ such that

$p^T\nabla f(x^*+tp) <0, \forall t\in [0,T]$

Why does the continuity of the gradient imply that? I understand that because the gradient is continuous, we can move a small distance and retain the sign. But I'd suppose that works for $\nabla f$ only. Why does it work for $p^T\nabla f(x^*+tp)$?

Also, what if I chose $p = \nabla f(x^*)$ instead of the negative?


There are 2 best solutions below

On BEST ANSWER

I think what you're missing is the following fact: if $F:\mathbb{R}^n\rightarrow\mathbb{R}$ is continuous and satisfies $F(x_0)<0$, then there exists some $\delta>0$ such that $F(x)<0$ for all $x$ such that $|x-x_0|<\delta$. You should try to prove this from the limit definition of continuity. In your particular example, $F$ is the continuous function $F(x) = p^T\nabla f(x)$ and $x_0 = x^*$, so by the fact above there is some $\delta>0$ such that $p^T\nabla f(x)<0$ for all $x$ with $|x-x^*|<\delta$. We then just choose $T$ so that $|(x^*+tp)-x^*| = t|p| <\delta$ whenever $0\le t\le T$; since $p\neq 0$, any $T$ with $0 < T<\delta/|p|$ works.
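To see the sign-preservation fact numerically, here is a minimal sketch. The quadratic `f`, the point `x_star`, and the value `T = 0.3` are assumptions made up for illustration; they do not come from the textbook or the question. The script evaluates $p^T\nabla f(x^*+tp)$ on a grid of $t\in[0,T]$ and shows that it stays negative.

```python
# Hypothetical test function (an assumption, not from the question):
# f(x, y) = (x - 1)^2 + 2 (y + 0.5)^2
def f(x):
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2


def grad_f(x):
    # Gradient of the hypothetical f above, computed by hand.
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5)]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def directional_slope(x_star, p, t):
    """Return p^T grad f(x* + t p), the function F evaluated along the ray."""
    point = [xi + t * pi for xi, pi in zip(x_star, p)]
    return dot(p, grad_f(point))


x_star = [0.0, 0.0]                 # a point where the gradient is nonzero
p = [-g for g in grad_f(x_star)]    # p = -grad f(x*)
T = 0.3                             # for this f the slope is 24 t - 8, negative for t < 1/3

ts = [T * k / 10 for k in range(11)]
slopes = [directional_slope(x_star, p, t) for t in ts]
for t, s in zip(ts, slopes):
    print(f"t = {t:.2f}   p^T grad f(x* + t p) = {s:+.3f}")
```

For this particular `f` the slope is $24t - 8$, so any $T < 1/3$ would do; a different function would generally need a different $T$.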


Let $p \in \mathbb R^n$, and $g:\mathbb R^n \to \mathbb R^n$, so that we can write $g(x) = (g_1(x),g_2(x),\ldots,g_n(x))$. Write $$G(x) = p^Tg(x) = p_1g_1(x) + p_2g_2(x) + \cdots + p_ng_n(x).$$

If $g$ is continuous, then so is each $g_i$. Hence $G$, being a linear combination of the $g_i$, inherits continuity from $g$. Now take $p$ as defined in your textbook, and choose $g = \nabla f$.

You can choose $p = \nabla f(x^*)$ if you wish: you would just need to rejig the proof a little bit to make it work. It reads nicely as is, however, so there's no reason to do this.
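As a sketch of how the rejigged argument might go (this is one reading, not taken from the textbook): with $p = \nabla f(x^*)$ we get $p^T\nabla f(x^*) = \|\nabla f(x^*)\|^2 > 0$, and the same continuity argument gives a $T>0$ with $p^T\nabla f(x^*-tp) > 0$ for all $t\in[0,T]$. Stepping in the direction $-p$ and using the mean value form of Taylor's theorem, for any $\bar t\in(0,T]$ there is some $t\in(0,\bar t)$ with

$$f(x^* - \bar t p) = f(x^*) - \bar t\, p^T\nabla f(x^* - tp) < f(x^*),$$

so $x^*$ still cannot be a local minimizer. The only changes are the sign of the inequality and the direction of the step.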