In Chapter 2 (on unconstrained optimization) of Nocedal & Wright's Numerical Optimization, beginning on page 20, the authors verify that the steepest descent direction, $-\nabla f_k$, is the direction along which the objective $f$ decreases most rapidly. I have some trouble following their proof.
Starting from Taylor's theorem,
$$f(x+p) = f(x) + \nabla f(x)^T p + \frac{1}{2} p^T \nabla^2 f(x+tp)p$$
for some $t \in (0,1)$, we have for the iterate $x_k$, step direction $p$ and step length $\alpha$,
$$f(x_k + \alpha p) = f(x_k) + \alpha p^T \nabla f_k + \frac{1}{2} \alpha^2 p^T \nabla^2 f(x_k + tp)p$$
for some $t \in (0,\alpha)$. The text claims:
> The rate of change in $f$ along the direction $p$ at $x_k$ is simply the coefficient of $\alpha$, namely $p^T \nabla f_k$. Hence, the unit direction $p$ of most rapid decrease is the solution to the problem $$\min_{p} p^T \nabla f_k$$ subject to $\| p \| = 1$.
How does the quoted text follow from Taylor's theorem? I understand that, in general, the rate of change of $f$ along the direction $p$ is the projection of $\nabla f_k$ onto $p$, that is, $$\frac{p^T \nabla f_k}{\|p\|},$$ where $\|p\| = 1$ here. But why did the text need to bring up Taylor's theorem at all?
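
For reference, the claim itself can at least be checked numerically. The sketch below is only an illustration under my own assumptions: the test function `f`, the point `x_k`, and the helper names (`directional_slope`, `unit`) are made up for this check and do not come from the book. It compares the difference quotient $(f(x_k + \alpha p) - f(x_k))/\alpha$ with $p^T \nabla f_k$ as $\alpha$ shrinks, and then scans unit directions in the plane to see which one minimizes $p^T \nabla f_k$.

```python
import math

# Hypothetical smooth test function on R^2 (not from the book).
def f(x):
    return x[0] ** 2 + 3.0 * x[1] ** 2 + math.sin(x[0])

# Its gradient, computed by hand.
def grad_f(x):
    return (2.0 * x[0] + math.cos(x[0]), 6.0 * x[1])

def directional_slope(x, p, alpha):
    """Difference quotient (f(x + alpha*p) - f(x)) / alpha."""
    shifted = (x[0] + alpha * p[0], x[1] + alpha * p[1])
    return (f(shifted) - f(x)) / alpha

def unit(theta):
    """Unit vector in the plane at angle theta."""
    return (math.cos(theta), math.sin(theta))

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

x_k = (1.0, -0.5)  # arbitrary iterate chosen for the check
g = grad_f(x_k)
g_norm = math.hypot(g[0], g[1])

# 1) The difference quotient approaches p^T grad f_k as alpha -> 0;
#    the leftover is the second-order Taylor term, which scales with alpha.
p = unit(0.7)  # arbitrary unit direction
for alpha in (1e-1, 1e-2, 1e-3):
    print(alpha, directional_slope(x_k, p, alpha), dot(p, g))

# 2) Among unit directions on a grid of angles, find the minimizer of
#    p^T grad f_k and compare it with -grad f_k / ||grad f_k||.
n = 3600
best_theta = min(
    (2.0 * math.pi * i / n for i in range(n)),
    key=lambda th: dot(unit(th), g),
)
best_p = unit(best_theta)
steepest = (-g[0] / g_norm, -g[1] / g_norm)
print(best_p, steepest)
```

With these choices, the gap between the difference quotient and $p^T \nabla f_k$ should shrink roughly in proportion to $\alpha$, and the grid minimizer should agree with the normalized negative gradient up to the grid resolution. This only confirms the statement numerically; it does not explain the role of Taylor's theorem in the argument.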