In my multivariable calculus class notes my teacher wrote the following:
Let $f: \mathbb R^n \to \Bbb R$ be a function differentiable at $p \in \Bbb R^n$, and let $v \in \Bbb R^n$. Then:
$$f'(p)(v) = \nabla f(p)\cdot v = \|\nabla f(p)\|\cdot \|v\| \cdot \cos(\theta)$$ where $\theta$ is the angle between $\nabla f(p)$ and $v$.
In particular, if $\|v\| = 1$, then: $$f'(p)(v) = \|\nabla f(p)\| \cos(\theta)$$ So we can conclude that:
- The gradient indicates the direction in which the function increases the fastest.
- The norm of the gradient is the amplitude of that growth.
My question is: How can we conclude both things just from the fact that $f'(p)(v) = \|\nabla f(p)\| \cos(\theta)$? I don't get how my teacher arrived at those conclusions. He didn't give any additional explanation, so I assume that this must be trivial and there's some detail that I'm not seeing.
Fix the point $p \in \mathbb{R}^n$. Since $-1 \leq \cos(\theta) \leq 1$, we have $$-\Vert \nabla f(p) \Vert \leq \Vert \nabla f(p) \Vert \cos(\theta) \leq \Vert \nabla f(p) \Vert.$$ Since $f'(p)v = \Vert \nabla f(p)\Vert \cos(\theta)$, it follows that $$-\Vert \nabla f(p) \Vert \leq f'(p)v \leq \Vert \nabla f(p) \Vert.$$ Therefore, for a fixed $p \in \mathbb{R}^n$ (and varying unit vector $v \in \mathbb{R}^n$), the maximum value of $f'(p)v$ is $\Vert \nabla f(p) \Vert$, and the minimum value is $-\Vert \nabla f(p) \Vert$. Since $f'(p)v$ represents the infinitesimal rate of change of $f$ at $p$ in the direction $v$, this addresses your second bullet point.
The maximum value of $f'(p)v$ is attained when $\cos(\theta) = 1$. Since the angle between two vectors lies in $[0, \pi]$, this means $\theta = 0$, that is, $v$ points in the same direction as $\nabla f(p)$. (This assumes $\nabla f(p) \neq 0$; otherwise $f'(p)v = 0$ for every $v$ and no direction is singled out.) So the direction $v$ in which $f'(p)v$ is maximal, i.e., the direction of fastest increase, is the direction in which $\nabla f(p)$ points. This addresses the first bullet point.
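If it helps to see this numerically, here is a small sanity check. It is only a sketch under my own assumptions: the function $f(x, y) = x^2 + 3y$, the point $p = (1, 2)$, and the helper names are all made up for illustration and are not from your notes. It estimates $f'(p)v$ by a central difference for many unit vectors $v$ and compares the largest rate found with $\|\nabla f(p)\|$.

```python
import math

# Hypothetical example function (an assumption for illustration): f(x, y) = x^2 + 3y
def f(x, y):
    return x**2 + 3*y

# Its gradient, computed by hand: (2x, 3)
def grad_f(x, y):
    return (2*x, 3.0)

def directional_derivative(p, v, h=1e-6):
    # Central-difference estimate of the rate of change of f at p in direction v
    forward = f(p[0] + h*v[0], p[1] + h*v[1])
    backward = f(p[0] - h*v[0], p[1] - h*v[1])
    return (forward - backward) / (2*h)

p = (1.0, 2.0)
g = grad_f(*p)
g_norm = math.hypot(*g)
grad_dir = (g[0] / g_norm, g[1] / g_norm)  # unit vector along the gradient

# Sample unit vectors v = (cos t, sin t) around the whole circle
n = 3600
angles = [2*math.pi*k/n for k in range(n)]
rates = [directional_derivative(p, (math.cos(t), math.sin(t))) for t in angles]

best = max(range(n), key=lambda k: rates[k])
best_v = (math.cos(angles[best]), math.sin(angles[best]))
max_rate = rates[best]
min_rate = min(rates)

print("norm of gradient:   ", g_norm)
print("largest rate found: ", max_rate)
print("smallest rate found:", min_rate)
print("gradient direction: ", grad_dir)
print("best sampled v:     ", best_v)
```

Up to the resolution of the sampling, the largest rate should agree with $\|\nabla f(p)\|$, the smallest with $-\|\nabla f(p)\|$, and the best sampled $v$ should line up with $\nabla f(p)/\|\nabla f(p)\|$, which is exactly what the two bullet points say.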