Let's consider three sets of vectors, dependent on each other as follows: $\textbf{A}=\textbf{A}(\textbf{B})$, $\textbf{B}=\textbf{B}(\textbf{C})$
Using the chain rule, $ \dfrac{d\textbf{A}}{d\textbf{C}}=\dfrac{d\textbf{A}}{d\textbf{B}}\dfrac{d\textbf{B}}{d\textbf{C}}$
How can one expand this rule to calculate the second order derivative matrix $\dfrac{d^2\textbf{A}}{d\textbf{C}^2}$ if $\dfrac{d^2\textbf{A}}{d\textbf{B}^2}, \dfrac{d^2\textbf{B}}{d\textbf{C}^2}$ are also known?
I was thinking of something like $\dfrac{d^2\textbf{A}}{d\textbf{C}^2}= \dfrac{d\textbf{B}}{d\textbf{C}}^T \dfrac{d^2\textbf{A}}{d\textbf{B}^2} \dfrac{d\textbf{B}}{d\textbf{C}} + \dfrac{d\textbf{A}}{d\textbf{B}} \dfrac{d^2\textbf{B}}{d\textbf{C}^2}$, which resembles the second-order chain rule for scalar multivariable functions, but I don't think it holds as written.
Thanks for all the tips!
edit: I am using the Python NumPy package, so I can sum over any tensor axes using tensordot.
EDIT: Using einsum of the NumPy package, one can write:
    dAdC = np.einsum('ik,km->im', dAdB, dBdC)
    DADC = np.einsum('ikl,km,ln->imn', DADB, dBdC, dBdC) + np.einsum('ik,kmn->imn', dAdB, DBDC)
for the first (small d) and second order (capital D) derivatives, respectively.
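For readers who prefer tensordot (mentioned in the first edit), the same contractions can be written that way. The shapes below are made up for illustration, and random arrays stand in for the real derivative tensors; the point is only that the tensordot calls reproduce the einsum expressions:

```python
import numpy as np

# Illustrative shapes (my own choice): A in R^4, B in R^3, C in R^2.
# Random arrays stand in for the actual derivative tensors.
rng = np.random.default_rng(0)
dAdB = rng.standard_normal((4, 3))       # dA_i / dB_k
dBdC = rng.standard_normal((3, 2))       # dB_k / dC_m
DADB = rng.standard_normal((4, 3, 3))    # d^2 A_i / dB_k dB_l
DBDC = rng.standard_normal((3, 2, 2))    # d^2 B_k / dC_m dC_n

# The einsum expressions from the edit
dAdC = np.einsum('ik,km->im', dAdB, dBdC)
DADC = (np.einsum('ikl,km,ln->imn', DADB, dBdC, dBdC)
        + np.einsum('ik,kmn->imn', dAdB, DBDC))

# The same contractions written with tensordot
dAdC_td = np.tensordot(dAdB, dBdC, axes=([1], [0]))
term1 = np.tensordot(np.tensordot(DADB, dBdC, axes=([1], [0])),  # -> (i, l, m)
                     dBdC, axes=([1], [0]))                      # -> (i, m, n)
term2 = np.tensordot(dAdB, DBDC, axes=([1], [0]))                # -> (i, m, n)

assert np.allclose(dAdC, dAdC_td)
assert np.allclose(DADC, term1 + term2)
```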
Depending on how the tensors are combined, I think your expression can be made to work.
I've used $x$, $y$ and $z$ for $C$, $B$ and $A$ respectively in the work below, hope it makes sense:
Let $x\in \mathbb{R}^m$ and $y \in \mathbb{R}^n$. Let $y$ depend on $x$ so we can find the Jacobian of $y$ as a function of $x$.
$$ J_x y \in \mathbb{R}^{n \times m} \qquad \left. J_{x} y \right|_{i,j} = \frac{\partial y_{i}}{\partial x_{j}} $$
where $J|_{i,j}$ denotes the $(i,j)$ element of the Jacobian.
Let $z \in \mathbb{R}^{k}$ be a function of $y$. We can also find the Jacobian of $z$ as a function of $y$.
$$ J_y z \in \mathbb{R}^{k \times n} \qquad \left. J_{y} z \right|_{t,i} = \frac{\partial z_{t}}{\partial y_{i}} $$
We can view $z$ as a function of $x$ by composition
$$ x \rightarrow y \rightarrow z \qquad \qquad \mathbb{R}^{m} \rightarrow \mathbb{R}^{n} \rightarrow \mathbb{R}^{k} $$
The Jacobian of $z$ as a function of $x$:
$$ J_x z \in \mathbb{R}^{k \times m} \qquad \left. J_{x} z \right|_{t,j} = \frac{\partial z_{t}}{\partial x_{j}} $$
The term
$$ \frac{\partial z_{t}}{\partial x_{j}} $$
can be written as a sum
$$ \frac{\partial z_{t}}{\partial x_{j}} = \sum_{i=1}^{n} \frac{\partial z_{t}}{\partial y_{i}} \frac{\partial y_{i}}{\partial x_{j}} $$
Let's drop the summation and use Einstein notation, i.e. we assume summation over indices that are repeated, so we can write
$$ \frac{\partial z_{t}}{\partial x_{j}} = \frac{\partial z_{t}}{\partial y_{i}} \frac{\partial y_{i}}{\partial x_{j}} $$
This basically represents matrix multiplication. The Jacobians are matrices (2-tensors) and we have
$$ J_{x} z = J_{y} z \; \; J_{x} y $$
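As a sanity check, this matrix form can be verified numerically for a small composition. The maps $f$ and $g$ below are my own made-up examples, not anything from the question; the chained Jacobian product is compared against finite differences of the composition:

```python
import numpy as np

# Illustrative maps (my own choice): y = f(x), R^2 -> R^3; z = g(y), R^3 -> R^2
def f(x):
    return np.array([x[0] * x[1], x[0] + x[1], x[0] ** 2])

def g(y):
    return np.array([y[0] + y[1] * y[2], y[0] * y[1]])

def Jf(x):   # analytic Jacobian J_x y |_{i,j}, shape (3, 2)
    return np.array([[x[1], x[0]],
                     [1.0,  1.0],
                     [2 * x[0], 0.0]])

def Jg(y):   # analytic Jacobian J_y z |_{t,i}, shape (2, 3)
    return np.array([[1.0, y[2], y[1]],
                     [y[1], y[0], 0.0]])

x = np.array([0.7, -1.3])
y = f(x)

# Chain rule: J_x z = J_y z @ J_x y  (matrix product over the shared index i)
J_chain = Jg(y) @ Jf(x)

# Compare against central finite differences of the composition g(f(x))
eps = 1e-6
J_fd = np.column_stack([
    (g(f(x + eps * e)) - g(f(x - eps * e))) / (2 * eps)
    for e in np.eye(2)
])
assert np.allclose(J_chain, J_fd, atol=1e-6)
```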
What about the Hessian? Is there a chain rule for Hessians?
The Hessian of $y$ as a function of $x$ is
$$ H_{x} y = \nabla_{x} J_{x} y $$
It is a 3-tensor $$ \left. H_{x} y \right\vert_{q,i,j} = \frac{\partial}{\partial x_{q}} J_{x} y \vert_{i,j} $$
$$ \left. H_{x} y \right\vert_{q,i,j} = \frac{\partial}{\partial x_{q}} \frac{\partial y_{i}}{\partial x_{j}} = \frac{\partial^{2} y_{i}}{\partial x_{q} \, x_{j}} $$
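To make the 3-tensor concrete, here is this object for a small map of my own choosing, built analytically and cross-checked against finite differences; note it is symmetric in the two differentiation axes $q$ and $j$:

```python
import numpy as np

# Illustrative map (my own choice): y = f(x), R^2 -> R^3
def f(x):
    return np.array([x[0] * x[1], x[0] + x[1], x[0] ** 2])

# Analytic Hessian 3-tensor, H_x y |_{q,i,j} = d^2 y_i / (dx_q dx_j),
# shape (m, n, m) = (2, 3, 2)
H = np.zeros((2, 3, 2))
H[0, 0, 1] = H[1, 0, 0] = 1.0   # y_0 = x_0 * x_1
H[0, 2, 0] = 2.0                # y_2 = x_0 ** 2  (y_1 is linear: all zeros)

# Mixed partials commute, so H is symmetric in the q and j axes
assert np.allclose(H, H.transpose(2, 1, 0))

# Cross-check the (q=0, j=1) slice by a central finite-difference stencil
x, eps = np.array([0.3, -0.8]), 1e-5
e0, e1 = np.eye(2)
fd = (f(x + eps * e0 + eps * e1) - f(x + eps * e0 - eps * e1)
      - f(x - eps * e0 + eps * e1) + f(x - eps * e0 - eps * e1)) / (4 * eps**2)
assert np.allclose(H[0, :, 1], fd, atol=1e-4)
```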
The Hessian of $z$ as a function of $y$ is
$$ \left. H_{y} z \right\vert_{r,t,i} = \frac{\partial}{\partial y_{r}} \frac{\partial z_{t}}{\partial y_{i}} = \frac{\partial^{2} z_{t}}{\partial y_{r} \, y_{i}} $$
What about the Hessian of $z$ as a function $x$?
$$ \left. H_{x} z \right\vert_{s,t,j} = \frac{\partial}{\partial x_{s}} \frac{\partial z_{t}}{\partial x_{j}} = \frac{\partial^{2} z_{t}}{\partial x_{s} \, x_{j}} $$
We can use the expression derived earlier for $\partial z_{t}/\partial x_{j}$ which is an element in the Jacobian of $z$ with respect to $x$:
$$ \left. H_{x} z \right\vert_{s,t,j} = \frac{\partial}{\partial x_{s}} \left[ \frac{\partial z_{t}}{\partial y_{i}} \frac{\partial y_{i}}{\partial x_{j}} \right] $$
Remembering that repeated indices denote summation, this evaluates with the product rule to the following:
$$ \begin{aligned} \left. H_{x} z \right\vert_{s,t,j} & = \frac{\partial}{\partial x_{s}} \left[ \frac{\partial z_{t}}{\partial y_{i}} \frac{\partial y_{i}}{\partial x_{j}} \right] \\ &= \frac{\partial z_{t}}{\partial y_{i}} \; \frac{\partial}{\partial x_{s}} \frac{\partial y_{i}}{\partial x_{j}} + \frac{\partial}{\partial x_{s}} \frac{\partial z_{t}}{\partial y_{i}} \; \frac{\partial y_{i}}{\partial x_{j}} \\ &= \frac{\partial z_{t}}{\partial y_{i}} \, \frac{\partial^{2} y_{i}}{\partial x_{s} x_{j}} + \frac{\partial}{\partial x_{s}} \frac{\partial z_{t}}{\partial y_{i}} \, \frac{\partial y_{i}}{\partial x_{j}} \end{aligned} $$
The first factor of the second term involves all three variables; let's focus on it:
$$ \frac{\partial}{\partial x_{s}} \frac{\partial z_{t}}{\partial y_{i}} $$
Assuming equality of mixed partials, we can write it as
$$ \frac{\partial}{\partial y_{i} } \frac{\partial z_{t}}{\partial x_{s}} $$
Repeating the substitution made earlier,
$$ \begin{aligned} \frac{\partial}{\partial y_{i} } \frac{\partial z_{t}}{\partial x_{s}} &= \frac{\partial}{\partial y_{i} } \left[ \frac{\partial z_{t}}{\partial y_{u}} \frac{\partial y_{u}}{\partial x_{s}} \right] \\ &= \frac{\partial}{\partial y_{i} } \frac{\partial z_{t}}{\partial y_{u}} \; \frac{\partial y_{u}}{\partial x_{s}} + \frac{\partial z_{t}}{\partial y_{u}} \; \frac{\partial}{\partial y_{i} } \frac{\partial y_{u}}{\partial x_{s}} \\ &= \frac{\partial^{2} z_{t}}{\partial y_{i} y_{u}} \; \frac{\partial y_{u}}{\partial x_{s}} + \frac{\partial z_{t}}{\partial y_{u}} \; \frac{\partial}{\partial y_{i} } \frac{\partial y_{u}}{\partial x_{s}} \\ &= \frac{\partial^{2} z_{t}}{\partial y_{i} y_{u}} \; \frac{\partial y_{u}}{\partial x_{s}} + \frac{\partial z_{t}}{\partial y_{u}} \; \frac{\partial}{\partial x_{s} } \frac{\partial y_{u}}{\partial y_{i} } \\ &= \frac{\partial^{2} z_{t}}{\partial y_{i} y_{u}} \; \frac{\partial y_{u}}{\partial x_{s}} + \frac{\partial z_{t}}{\partial y_{u}} \; \frac{\partial}{\partial x_{s} } \delta_{i,u} \\ &= \frac{\partial^{2} z_{t}}{\partial y_{i} y_{u}} \; \frac{\partial y_{u}}{\partial x_{s}} + \frac{\partial z_{t}}{\partial y_{u}} \; 0 \\ &= \frac{\partial^{2} z_{t}}{\partial y_{i} y_{u}} \; \frac{\partial y_{u}}{\partial x_{s}} \end{aligned} $$
Now replace this expression into the second term of the earlier equation for the Hessian of $z$ with respect to $x$
$$ \begin{aligned} \left. H_{x} z \right\vert_{s,t,j} &= \frac{\partial z_{t}}{\partial y_{i}} \, \frac{\partial^{2} y_{i}}{\partial x_{s} x_{j}} + \frac{\partial}{\partial x_{s}} \frac{\partial z_{t}}{\partial y_{i}} \, \frac{\partial y_{i}}{\partial x_{j}} \\ &= \frac{\partial z_{t}}{\partial y_{i}} \, \frac{\partial^{2} y_{i}}{\partial x_{s} x_{j}} + \frac{\partial^{2} z_{t}}{\partial y_{i} y_{u}} \; \frac{\partial y_{u}}{\partial x_{s}} \, \frac{\partial y_{i}}{\partial x_{j}} \end{aligned} $$
Now we can replace the partial derivative notation with the Jacobian/Hessian notation to get an expression for the Hessian of $z$ as a function of $x$ in terms of the intermediate Jacobians/Hessians:
$$ \begin{aligned} \left. H_{x} z \right\vert_{s,t,j} &= \frac{\partial z_{t}}{\partial y_{i}} \, \frac{\partial^{2} y_{i}}{\partial x_{s} x_{j}} + \frac{\partial^{2} z_{t}}{\partial y_{i} y_{u}} \; \frac{\partial y_{u}}{\partial x_{s}} \, \frac{\partial y_{i}}{\partial x_{j}} \\ &= \left. J_{y} z \right|_{t,i} \, \left. H_{x} y \right|_{s,i,j} + \left. H_{y} z \right|_{i,t,u} \; \left. J_{x} y \right|_{{u,s}} \, \left. J_{x} y \right|_{i,j} \end{aligned} $$
So for the first term:
$$ \left. J_{y} z \right|_{t,i} \, \left. H_{x} y \right|_{s,i,j} $$ we contract along the second dimension of the Jacobian of $z$ and the second dimension of the Hessian of $y$, since the shared index $i$ is the second index of both $J_{y} z$ and $H_{x} y$. The NumPy tensordot function accepts an `axes` argument to set which dimensions to carry out the product over.
For the second term $$ \left. H_{y} z \right|_{i,t,u} \; \left. J_{x} y \right|_{{u,s}} \, \left. J_{x} y \right|_{i,j} $$ there are two tensor products, and the shared indices indicate over which dimensions to take them.
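Putting it all together, the final formula can be checked numerically with einsum, using exactly the index orders derived above. The maps $f$ and $g$ below are illustrative examples of my own, and the result is verified against finite differences of the composed Jacobian:

```python
import numpy as np

# Illustrative maps (my own choice): y = f(x), R^2 -> R^3; z = g(y), R^3 -> R^2
def f(x):
    return np.array([x[0] * x[1], x[0] + x[1], x[0] ** 2])

def g(y):
    return np.array([y[0] + y[1] * y[2], y[0] * y[1]])

def Jf(x):   # J_x y |_{i,j}, shape (3, 2)
    return np.array([[x[1], x[0]], [1.0, 1.0], [2 * x[0], 0.0]])

def Jg(y):   # J_y z |_{t,i}, shape (2, 3)
    return np.array([[1.0, y[2], y[1]], [y[1], y[0], 0.0]])

def Hf(x):   # H_x y |_{q,i,j}, shape (2, 3, 2)
    H = np.zeros((2, 3, 2))
    H[0, 0, 1] = H[1, 0, 0] = 1.0   # y_0 = x_0 * x_1
    H[0, 2, 0] = 2.0                # y_2 = x_0 ** 2
    return H

def Hg(y):   # H_y z |_{r,t,i}, shape (3, 2, 3)
    H = np.zeros((3, 2, 3))
    H[1, 0, 2] = H[2, 0, 1] = 1.0   # z_0 = y_0 + y_1 * y_2
    H[0, 1, 1] = H[1, 1, 0] = 1.0   # z_1 = y_0 * y_1
    return H

x = np.array([0.7, -1.3])
y = f(x)

# H_x z |_{s,t,j} = J_y z|_{t,i} H_x y|_{s,i,j} + H_y z|_{i,t,u} J_x y|_{u,s} J_x y|_{i,j}
Hz = (np.einsum('ti,sij->stj', Jg(y), Hf(x))
      + np.einsum('itu,us,ij->stj', Hg(y), Jf(x), Jf(x)))

# Check against central finite differences of the composed Jacobian
def Jcomp(x):
    return Jg(f(x)) @ Jf(x)   # first-order chain rule: J_x z

eps = 1e-6
Hz_fd = np.stack([(Jcomp(x + eps * e) - Jcomp(x - eps * e)) / (2 * eps)
                  for e in np.eye(2)])   # index order (s, t, j)
assert np.allclose(Hz, Hz_fd, atol=1e-5)
```

Note the index convention here puts the differentiating index first, $H|_{s,t,j}$, whereas the question's einsum expressions put the output index first; the two differ only by a transpose of the axes.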