Questions about different formulations of the Taylor expansion terms


I'm reading Numerical Optimization by Nocedal and Wright, and was playing around with the matrix notation of the Taylor expansion, about which I have two questions.

We have a three-times differentiable function $f\,:\,\mathbb{R}^n\rightarrow \mathbb{R}$. Its Taylor expansion around the point $a$, up to the quadratic term, is given below, where $x,\,\,a,\,\,p\,\in\,\mathbb{R}^n$ and $p$ is the displacement from $a$ to $x$ (so $x = a + p$):

$$ f\left( x \right) =f\left( a+p \right) \approx f\left( a \right) +\nabla f\left( a \right) ^Tp+\frac{1}{2}p^T\nabla ^2f\left( a \right) p $$

  1. My first question is: Can the quadratic term be expressed the following way?

$$ \frac{1}{2}p^T\nabla ^2f\left( a \right) p\,\,\overset{?}{=}\,\,\frac{1}{2}\nabla \left( \nabla f\left( a \right) ^Tp \right) ^Tp $$

(I tried to prove it by matching the matrix indices term by term, but always got lost somewhere.)

  2. My second question applies only if the answer to the first is yes. Is it possible to write the third-order term in this manner?

$$ \frac{1}{6}\nabla \left( p^T\nabla ^2f\left( a \right) p \right) ^Tp $$

I know that these forms would be of little practical use; I'm just asking out of curiosity.

EDIT:

I've made some calculations, and it worked out in the case I tried, but I still can't prove it:

$$ f\left( a \right) =2a_{1}^{3}a_{2}^{4}+a_{1}^{2} $$

$$ a=\left[ \begin{array}{c} a_1\\ a_2\\ \end{array} \right] =\left[ \begin{array}{c} 1\\ 2\\ \end{array} \right] \,\,\,\,\,\,\,\,\,\,\,\,p=\left[ \begin{array}{c} p_1\\ p_2\\ \end{array} \right] =\left[ \begin{array}{c} 2\\ 3\\ \end{array} \right] $$

$$ \nabla f\left( a \right) =\left[ \begin{array}{c} 6a_{1}^{2}a_{2}^{4}+2a_1\\ 8a_{1}^{3}a_{2}^{3}\\ \end{array} \right] $$

$$ \nabla ^2f\left( a \right) =\left[ \begin{matrix} 12a_1a_{2}^{4}+2& 24a_{1}^{2}a_{2}^{3}\\ 24a_{1}^{2}a_{2}^{3}& 24a_{1}^{3}a_{2}^{2}\\ \end{matrix} \right] $$

$$ p^T\nabla ^2f\left( a \right) p=\left[ \begin{matrix} 2& 3\\ \end{matrix} \right] \left[ \begin{matrix} 194& 192\\ 192& 96\\ \end{matrix} \right] \left[ \begin{array}{c} 2\\ 3\\ \end{array} \right] =3944 $$

$$ \begin{aligned}
\nabla \left( \nabla f\left( a \right) ^Tp \right) ^Tp &=\nabla \left( \left[ \begin{matrix} 6a_{1}^{2}a_{2}^{4}+2a_1& 8a_{1}^{3}a_{2}^{3}\\ \end{matrix} \right] \left[ \begin{array}{c} p_1\\ p_2\\ \end{array} \right] \right) ^Tp \\
&=\nabla \left( 6a_{1}^{2}a_{2}^{4}p_1+2a_1p_1+8a_{1}^{3}a_{2}^{3}p_2 \right) ^Tp \\
&=\left[ \begin{array}{c} 12a_1a_{2}^{4}p_1+2p_1+24a_{1}^{2}a_{2}^{3}p_2\\ 24a_{1}^{2}a_{2}^{3}p_1+24a_{1}^{3}a_{2}^{2}p_2\\ \end{array} \right] ^T\left[ \begin{array}{c} p_1\\ p_2\\ \end{array} \right] \\
&=\left[ \begin{matrix} 964& 672\\ \end{matrix} \right] \left[ \begin{array}{c} 2\\ 3\\ \end{array} \right] =1928+2016=3944
\end{aligned} $$
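The arithmetic above can also be checked numerically. The following is a minimal sketch in plain Python, not a proof: the function names are illustrative, and the step size `h = 1e-5` is an assumption chosen for this particular polynomial. It computes $p^T\nabla^2 f(a)p$ from the analytic Hessian and compares it with a central-difference estimate of the directional derivative of $g(x)=\nabla f(x)^Tp$ along $p$ at $x=a$, with $p$ held fixed while $x$ varies.

```python
# Check that p^T H(a) p matches the directional derivative along p of
# g(x) = grad f(x)^T p, for f(x) = 2*x1**3*x2**4 + x1**2.

def grad_f(x1, x2):
    """Analytic gradient of f(x) = 2*x1**3*x2**4 + x1**2."""
    return (6 * x1**2 * x2**4 + 2 * x1, 8 * x1**3 * x2**3)

def hess_f(x1, x2):
    """Analytic Hessian of f, returned as a tuple of rows."""
    return ((12 * x1 * x2**4 + 2, 24 * x1**2 * x2**3),
            (24 * x1**2 * x2**3, 24 * x1**3 * x2**2))

def g(x1, x2, p):
    """g(x) = grad f(x)^T p, where p is a fixed vector and x varies."""
    g1, g2 = grad_f(x1, x2)
    return g1 * p[0] + g2 * p[1]

def quadratic_form(a, p):
    """p^T H(a) p computed directly from the Hessian entries."""
    H = hess_f(*a)
    return sum(p[i] * H[i][j] * p[j] for i in range(2) for j in range(2))

def directional_derivative_of_g(a, p, h=1e-5):
    """Central-difference estimate of d/dt g(a + t*p) at t = 0."""
    forward = g(a[0] + h * p[0], a[1] + h * p[1], p)
    backward = g(a[0] - h * p[0], a[1] - h * p[1], p)
    return (forward - backward) / (2 * h)

a = (1.0, 2.0)
p = (2.0, 3.0)
exact = quadratic_form(a, p)
estimate = directional_derivative_of_g(a, p)
print(exact)                         # prints 3944.0
print(abs(estimate - exact) < 1e-3)  # prints True
```

The finite difference only agrees up to truncation and rounding error, so the comparison uses a tolerance rather than exact equality.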

Best answer:

You can do something like that, though you need to be a bit careful with your notation. $\nabla f(a)^Tp$ is a constant that no longer depends on $x$, so if you differentiate it again, you will just get the zero vector. But I know what you mean: let's define $Jf(x)$ to be the Jacobian of $f$ at $x$ (a row vector, for $f:\mathbb{R}^n\to \mathbb{R}$). Then indeed you can write the second-order term as $$\frac{1}{2}J\left[Jf(x)p\right](a)p.$$ Here the inner expression $Jf(x)p$ is again a function of $x$, so it can be differentiated again before being evaluated at $a$.

The proof is straightforward, in coordinates:

$$Jf(x)p = \sum_{i=1}^n \frac{\partial f}{\partial x_i}(x) p_i$$

$$J[Jf(x)p](a)p = \sum_{j=1}^n \left[\frac{\partial}{\partial x_j}\left(\sum_{i=1}^n \frac{\partial f}{\partial x_i}(x)p_i\right)\right]_{x=a}p_j = \sum_{i,j=1}^n \frac{\partial^2 f}{\partial x_i\partial x_j}(a)p_ip_j = p^T Hf(a)p$$

where $Hf(a) = \nabla^2 f(a)$ is the Hessian, and we have used equality of mixed partials and the fact that $p$ is a constant that does not depend on the $x_i$.
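The same index argument appears to carry over to the second question. As a sketch that goes one step beyond the derivation above, and assuming the third partial derivatives of $f$ are continuous, apply $J[\,\cdot\,]p$ once more to the function $x \mapsto J[Jf(x)p](x)p$ and only then evaluate at $a$:

$$J\Big[J[Jf(x)p](x)\,p\Big](a)\,p = \sum_{k=1}^n \left[\frac{\partial}{\partial x_k}\left(\sum_{i,j=1}^n \frac{\partial^2 f}{\partial x_i\partial x_j}(x)p_ip_j\right)\right]_{x=a}p_k = \sum_{i,j,k=1}^n \frac{\partial^3 f}{\partial x_i\partial x_j\partial x_k}(a)p_ip_jp_k$$

Multiplied by $\frac{1}{6}$, the right-hand side is the usual third-order Taylor term. So the form proposed in the question should work under the same caveat as before: the Hessian inside the brackets has to be taken at a variable point $x$, with $a$ substituted only after the outer differentiation.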