Theorem: Suppose that $f:\mathbb R^{n} \rightarrow \mathbb R$ is continuously differentiable and that $x, p \in \mathbb R^n$. Then $$f (x + p) = f (x) + \nabla f (x + t p)^T p$$ for some $t \in (0,1)$.
Moreover, if $f$ is twice continuously differentiable, then $$f (x + p) = f (x) + \nabla f (x)^T p + \frac{1}{2}p^T \nabla^2 f (x + t p) p$$ for some $t \in (0,1)$.
I found this statement in a book, stated without proof. I searched several calculus books but couldn't find a simple proof of it.
This is the way I would do it. Define $g:[0,1]\to\mathbb R$ by $$ g(t)=f(x+tp). $$ Since $f$ is differentiable, so is $g$. By the Mean Value Theorem, there exists $t\in(0,1)$ with $$ f(x+p)-f(x)=g(1)-g(0)=g'(t). $$ By the Chain Rule, $$ g'(t)=\sum_{j=1}^n\frac{\partial f}{\partial x_j}(x+tp)\,p_j=\nabla f(x+tp)^Tp, $$ which gives the first formula. For the second one, use the Chain Rule again: \begin{align} g''(t)&=\sum_{k=1}^n\frac{\partial}{\partial x_k}\left(\sum_{j=1}^n\frac{\partial f}{\partial x_j}(x+tp)\,p_j\right)\,p_k =\sum_{k=1}^n\sum_{j=1}^n\frac{\partial^2 f}{\partial x_k\partial x_j} (x+tp)\,p_j\,p_k\\ &=p^TH(x+tp)\,p, \end{align} where $H$ is the Hessian matrix of $f$. If $f$ is twice continuously differentiable, then so is $g$, and the one-variable Taylor theorem with Lagrange remainder gives some $t\in(0,1)$ (in general not the same $t$ as before) with $$ g(1)=g(0)+g'(0)+\frac12 g''(t), $$ that is, $$ f(x+p)=f(x)+\nabla f(x)^Tp+\frac12\,p^TH(x+tp)\,p. $$ The notation $\nabla^2$, while bad and usually used for something else (the Laplacian), is more or less coherent, since in a sense the Hessian is the gradient of the gradient.
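As a sanity check, here is a small worked example. The choice of $f$ is my own assumption for illustration and does not come from the book. Take $f(x)=\frac12 x^TAx$ with $A$ a symmetric $n\times n$ matrix, so that $\nabla f(x)=Ax$ and $H(x)=A$ for every $x$. Then \begin{align} g(t)&=f(x+tp)=f(x)+t\,x^TAp+\frac{t^2}{2}\,p^TAp,\\ g'(t)&=x^TAp+t\,p^TAp=\nabla f(x+tp)^Tp,\\ g''(t)&=p^TAp. \end{align} Here $g(1)-g(0)=x^TAp+\frac12\,p^TAp=g'(1/2)$, so the first formula holds with $t=1/2$ (and with any $t$ when $p^TAp=0$). The second formula reads $f(x+p)=f(x)+(Ax)^Tp+\frac12\,p^TAp$, which holds for every $t\in(0,1)$ because the Hessian is constant.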