How can we calculate the derivative of the Hessian, namely the third derivative?

I have a problem calculating the following derivative, which should lie in $\mathbf{R}^{2\times 2}$ since $t\in \mathbf{R}$. According to the chain rule, $$\tag{1} \frac{d}{dt}\nabla^2f(x+tv)|_{t=0}=\nabla^3f(x+tv)|_{t=0}\cdot \frac{d(x+tv)}{dt}=\nabla^3f(x)\cdot v $$ where $x,v\in \mathbf{R}^2$. The gradient of $f(x)$ w.r.t. $x$ is $$ \nabla f(x)=\left[\begin{array}{c}\frac{\partial f}{\partial x_1}\\ \frac{\partial f}{\partial x_2}\end{array}\right] $$ The Hessian is $$ \nabla^2 f(x)=\nabla\nabla^T f(x)=\left[\begin{array}{cc}\frac{\partial^2 f}{\partial x_1^2}&\frac{\partial^2 f}{\partial x_2\partial x_1}\\ \frac{\partial^2 f}{\partial x_1\partial x_2}&\frac{\partial^2 f}{\partial x_2^2}\end{array}\right] $$

However, how can I calculate $\nabla^3f(x)$? My strategy is to differentiate each entry of $\nabla^2f(x)$ w.r.t. $x$, which gives a column vector $\frac{\partial[\nabla^2f]_{ij}}{\partial x}=[\frac{\partial[\nabla^2f]_{ij}}{\partial x_1};\frac{\partial[\nabla^2f]_{ij}}{\partial x_2}]$ per entry, so that $$ \nabla^3f(x)=\frac{\partial \nabla^2f(x)}{\partial x}=\left[\begin{array}{cc} \frac{\partial^3 f}{\partial x_1^3}&\frac{\partial^3 f}{\partial x_2\partial x_1^2}\\ \frac{\partial^3 f}{\partial x_1^2\partial x_2}&\frac{\partial^3 f}{\partial x_2^2\partial x_1}\\ \frac{\partial^3 f}{\partial x_1^2\partial x_2}&\frac{\partial^3 f}{\partial x_2^2\partial x_1}\\ \frac{\partial^3 f}{\partial x_1\partial x_2^2}&\frac{\partial^3 f}{\partial x_2^3} \end{array}\right]$$ The result is in $\mathbf{R}^{4\times 2}$. Continuing the calculation, $$ \nabla^3f(x)\cdot v=\left[\begin{array}{cc} \frac{\partial^3 f}{\partial x_1^3}&\frac{\partial^3 f}{\partial x_2\partial x_1^2}\\ \frac{\partial^3 f}{\partial x_1^2\partial x_2}&\frac{\partial^3 f}{\partial x_2^2\partial x_1}\\ \frac{\partial^3 f}{\partial x_1^2\partial x_2}&\frac{\partial^3 f}{\partial x_2^2\partial x_1}\\ \frac{\partial^3 f}{\partial x_1\partial x_2^2}&\frac{\partial^3 f}{\partial x_2^3} \end{array}\right]\left[\begin{array}{c}v_1\\v_2\end{array}\right]= \left[\begin{array}{c} \frac{\partial^3 f}{\partial x_1^3}v_1+\frac{\partial^3 f}{\partial x_2\partial x_1^2}v_2\\ \frac{\partial^3 f}{\partial x_1^2\partial x_2}v_1+\frac{\partial^3 f}{\partial x_2^2\partial x_1}v_2\\ \frac{\partial^3 f}{\partial x_1^2\partial x_2}v_1+\frac{\partial^3 f}{\partial x_2^2\partial x_1}v_2\\ \frac{\partial^3 f}{\partial x_1\partial x_2^2}v_1+\frac{\partial^3 f}{\partial x_2^3}v_2 \end{array}\right] $$ The result is in $\mathbf{R}^4$, not $\mathbf{R}^{2\times 2}$. Where does this go wrong?
Consider this from the point of view of linear approximation: $$\tag{2} \nabla^2f(x+tv)\approx\nabla^2 f(x) + \nabla^3f(x)v\cdot t $$ With my result, (2) does not make sense because the dimensions on the RHS are inconsistent: the first term is $2\times 2$ but the second term is $4\times 1$. What is wrong with my derivation? Still, the chain rule gives $$\tag{3} \frac{d}{dt}\nabla^2f(x+tv)|_{t=0}=\nabla^3f(x)v $$ Can anyone point out the mistake and show the correct form? Any guidance would be appreciated.
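
One way to sanity-check the shapes numerically is to keep $\nabla^3 f(x)$ as a $2\times 2\times 2$ array with entries $T_{ijk}=\frac{\partial^3 f}{\partial x_i\partial x_j\partial x_k}$ and contract only its last index with $v$, so that $\sum_k T_{ijk}v_k$ is again $2\times 2$. The sketch below is just an illustration under assumptions that are not part of the question: the test function $f(x_1,x_2)=x_1^3x_2+x_2^3$, the point `x`, the direction `v`, and all helper names are hypothetical choices. It compares the contraction with a central finite difference of the Hessian along $v$.

```python
# Illustrative (assumed) test function: f(x1, x2) = x1**3 * x2 + x2**3.

def hessian(x):
    """2x2 Hessian of the assumed f, derived by hand."""
    x1, x2 = x
    return [[6 * x1 * x2, 3 * x1**2],
            [3 * x1**2, 6 * x2]]


def third_derivative(x):
    """2x2x2 tensor T[i][j][k] = d^3 f / (dx_i dx_j dx_k) of the assumed f."""
    x1, x2 = x
    # By symmetry an entry depends only on how many of i, j, k select x2:
    # 0 -> f_111, 1 -> f_112, 2 -> f_122, 3 -> f_222.
    by_count = [6 * x2, 6 * x1, 0.0, 6.0]
    return [[[by_count[i + j + k] for k in range(2)]
             for j in range(2)]
            for i in range(2)]


def contract(T, v):
    """Contract the last index of T with v: D[i][j] = sum_k T[i][j][k] * v[k]."""
    return [[sum(T[i][j][k] * v[k] for k in range(2)) for j in range(2)]
            for i in range(2)]


def hessian_directional_fd(x, v, h=1e-4):
    """Central finite difference of t -> hessian(x + t*v) at t = 0."""
    x_plus = [xi + h * vi for xi, vi in zip(x, v)]
    x_minus = [xi - h * vi for xi, vi in zip(x, v)]
    H_plus, H_minus = hessian(x_plus), hessian(x_minus)
    return [[(H_plus[i][j] - H_minus[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]


x = [1.0, 2.0]
v = [1.0, 3.0]
D_exact = contract(third_derivative(x), v)  # 2x2 matrix from the tensor contraction
D_fd = hessian_directional_fd(x, v)         # 2x2 matrix from finite differences
print(D_exact)
print(D_fd)
```

Both results are $2\times 2$ and should agree up to rounding. Assuming $f$ is smooth enough for the mixed partials to commute, the $\mathbf{R}^4$ vector obtained in the question holds the same four numbers, so it can be read as this $2\times 2$ matrix flattened into a column.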