This is a claim from Nocedal and Wright's *Numerical Optimization*, 2nd ed., p. 23.
Suppose $f$ is $C^2$. Then
$\nabla f(x+p) = \nabla f(x) + \int\limits_0^1 \nabla^2 f(x+ tp)p\,dt$
Equivalently,
$\nabla f(x+p) = \nabla f(x) + \nabla^2f(x)p + \int\limits_0^1 (\nabla^2 f(x+ tp) - \nabla^2 f(x))p\,dt$
However, next comes the confounding claim:
Claim: Since $\nabla f(\cdot)$ is continuous, therefore $$\int\limits_0^1 (\nabla^2 f(x+ tp) - \nabla^2 f(x))pdt = o(\|p\|)$$
Note: the text defines, for $\eta(\cdot):\mathbb{R} \to \mathbb{R}$, $\eta(\nu) = o(\nu)$ if $\eta(\nu)/\nu \to 0$ as $\nu \to 0$ or $\nu \to \infty$.
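As a quick illustration of this definition in the case $\nu \to 0$ (the two functions below are chosen only for illustration and are not from the text):
$$\eta(\nu) = \nu^2:\ \frac{\eta(\nu)}{\nu} = \nu \to 0, \text{ so } \eta(\nu) = o(\nu); \qquad \eta(\nu) = 2\nu:\ \frac{\eta(\nu)}{\nu} = 2 \not\to 0, \text{ so } \eta(\nu) \neq o(\nu).$$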
Questions:
How was continuity used? For me, a vector-valued function is continuous if all its components are continuous. I can't see how to go from continuity of $\dfrac{\partial f(\cdot)}{\partial x_i}$ for all $i \in \{1, \ldots, n\}$ to a bound on that integral involving the Hessian.
How do we justify the $o(\|p\|)$ bound?
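Since the norm of an integral is at most the integral of the norm,
$$\left\| \int\limits_0^1 (\nabla^2 f(x+ tp) - \nabla^2 f(x))~p ~dt \right\| \leq \int\limits_0^1 \|(\nabla^2 f(x+ tp) - \nabla^2 f(x))p \|~dt$$
By the Cauchy-Schwarz inequality (in the form $\|Ap\| \leq \|A\|~\|p\|$ for a matrix norm compatible with the vector norm) we get:
$$\left\| \int\limits_0^1 (\nabla^2 f(x+ tp) - \nabla^2 f(x))~p ~dt \right\| \leq \int\limits_0^1 \| \nabla^2 f(x+ tp) - \nabla^2 f(x) \|~\|p \| ~ dt $$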
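By the mean value theorem for integrals, which applies because $t \mapsto \| \nabla^2 f(x+ tp) - \nabla^2 f(x) \|$ is continuous on $[0,1]$, for every $p$ there is a scalar $s_p \in [0,1]$ such that $$\int\limits_0^1 \| \nabla^2 f(x+ tp) - \nabla^2 f(x) \|~\|p \| ~ dt = \| \nabla^2 f(x+ s_pp) - \nabla^2 f(x) \|~\|p \| $$ Therefore we finally arrive at
$$ \frac{\left\| \int\limits_0^1 (\nabla^2 f(x+ tp) - \nabla^2 f(x))~p ~dt \right\|}{\|p\|} \leq \| \nabla^2 f(x+ s_pp) - \nabla^2 f(x) \| $$
Now let $p \to 0$. Since $\|s_p p\| \leq \|p\| \to 0$ and $\nabla^2 f$ is continuous at $x$, the right-hand side tends to $0$, which is the desired $o(\|p\|)$ bound. This is where continuity is used: what is needed is continuity of the Hessian $\nabla^2 f$, which holds because $f$ is $C^2$.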
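The bound can also be checked numerically. The following is a minimal sketch, not part of the text: the test function $f(x) = \tfrac14\sum_i x_i^4 + x_0 x_1$, the point `x`, the direction `d`, and the helper names `grad`, `hess`, and `remainder_ratio` are all assumptions made for illustration. It uses the fact that, by rearranging the Taylor identity, the integral remainder equals $\nabla f(x+p) - \nabla f(x) - \nabla^2 f(x)p$, so no quadrature is required.

```python
import numpy as np

# Hypothetical test function (not from the text):
#   f(x) = sum(x_i^4) / 4 + x_0 * x_1, which is C^2 on R^2.

def grad(x):
    g = x**3            # derivative of x_i^4 / 4 (new array, x is not modified)
    g[0] += x[1]        # derivative of the cross term x_0 * x_1 w.r.t. x_0
    g[1] += x[0]        # derivative of the cross term x_0 * x_1 w.r.t. x_1
    return g

def hess(x):
    H = np.diag(3.0 * x**2)   # second derivatives of x_i^4 / 4
    H[0, 1] += 1.0            # mixed partials of the cross term
    H[1, 0] += 1.0
    return H

def remainder_ratio(x, p):
    # The integral remainder equals grad f(x+p) - grad f(x) - hess f(x) p.
    r = grad(x + p) - grad(x) - hess(x) @ p
    return np.linalg.norm(r) / np.linalg.norm(p)

x = np.array([1.0, 2.0])          # arbitrary base point
d = np.array([0.6, 0.8])          # unit direction, so ||h * d|| = h
steps = (1e-1, 1e-2, 1e-3, 1e-4)
ratios = [remainder_ratio(x, h * d) for h in steps]

for h, ratio in zip(steps, ratios):
    print(f"||p|| = {h:.0e}   ||remainder|| / ||p|| = {ratio:.3e}")
```

If the bound holds, the printed ratio should shrink as $\|p\|$ shrinks. For this particular $f$ the Hessian is locally Lipschitz, so the ratio is expected to decay roughly in proportion to $\|p\|$, which is stronger than the $o(\|p\|)$ statement requires.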