Lipschitz continuous hessian for the twice differentiable function $f$ is defined as, for any $x, y$,
$$\|\nabla^2 f(x) - \nabla^2 f(y)\| \leq M\|x-y\|$$ for some postive $M$, how to derive the upper bound of
$|f(y) - f(x) - \nabla f(x)^T(y-x) - \frac{1}{2}(y - x) \nabla^2 f(x)^T (y-x)| \leq \frac{M}{6}\|y-x\|^3$
$\newcommand{\d}{\mathrm{d}}$ We will here denote the inner product of $x$ and $y$ by $\langle x, y\rangle. Since $f$ is twice continuously differentiable, we may use the fact that
$$ \begin{aligned} f(y) = f(x) + \langle \nabla f(x), y-x\rangle + \tfrac{1}{2} \langle \nabla^2 f(x+\lambda (y-x))(y-x), y-x \rangle \end{aligned}\tag{1}\label{1} $$
for some $\lambda\in [0,1]$. We may write this equivalently as (and this is maybe more convenient)
$$ \begin{aligned} f(y) = f(x) + \langle \nabla f(x), y-x\rangle + \int_0^1 \int_0^\tau \left\langle \nabla^2 f(x+\alpha (y-x))(y-x), y-x\right\rangle \d \alpha\ \d \tau. \end{aligned}\tag{2}\label{2} $$
Since $\nabla^2 f$ is $M$-Lipschitz,
$$ \begin{aligned} &\|\nabla^2 f(x+\alpha(y-x)) - \nabla^2 f(x)\| \leq M\alpha \|x-y\|, \\ \Leftrightarrow& \nabla^2 f(x+\alpha(y-x)) = \nabla^2 f(x) + H, \end{aligned}\tag{3} $$
where $\|H\|\leq M\alpha \|x-y\|$. We may now easily plug (3) into (2) and prove the original inequality. In particular, because of \eqref{2}:
$$ \begin{aligned} f(y) &= f(x) + \langle \nabla f(x), y-x\rangle + \int_0^1 \int_0^\tau \left\langle \nabla f^2(x)(y-x), y-x\right\rangle \d \alpha\ \d \tau\\ &+ \int_0^1 \int_0^\tau \left\langle H(y-x), y-x\right\rangle \d \alpha\ \d \tau, \end{aligned}\tag{4}\label{4} $$
where the first integral in \eqref{4} is
$$ \begin{aligned} \int_0^1 \int_0^\tau \left\langle \nabla f^2(x)(y-x), y-x\right\rangle \d \alpha\ \d \tau = \tfrac{1}{2}\left\langle \nabla f^2(x)(y-x), y-x\right\rangle \end{aligned} $$
and the second integral's absolute value is
$$ \begin{aligned} \left|\int_0^1 \int_0^\tau \left\langle H(y-x), y-x\right\rangle \d \alpha\ \d \tau\right| &\leq \int_0^1 \int_0^\tau \|H\|\cdot \|y-x\|^2 \d \alpha\ \d \tau\\ &\leq \int_0^1 \int_0^\tau M \alpha \|y-x\|^3 \d \alpha\ \d \tau\\ &=\tfrac{M}{6}\|y-x\|^3. \end{aligned} $$