Let $\cal{J}(y) : \mathbb{R}^n \rightarrow \mathbb{R}$.
Furthermore, suppose $y := h(x)$ with the nonlinear mapping $h : \mathbb{R}^n \rightarrow \mathbb{R}^n$ twice differentiable. The Jacobian of $h$ is diagonal (i.e., $y_i = h_i(x_i)$ for all $i = 1 \ldots n$).
I want to calculate the second directional derivative of $J$ with respect to the variable $y$ along an arbitrary direction $v \in \mathbb{R}^n$ - as a function of the second derivative of $J$ wrt $x$ (along $v$) and that of the mapping $h$.
The chain rule gives:
$$ \frac{\partial \cal J}{\partial x} = \frac{\partial \cal J}{\partial y} \frac{\partial y}{\partial x}. $$
Here I'm following the standard convention (gradients = "row" vectors).
Then, I calculate the directional derivative of the (transposed) equation above - and things become a bit confusing...
$$ \frac{\partial \cal^2 J}{\partial x^2} \cdot v = \left( \frac{\partial y}{\partial x} \right)^T \left( \frac{\partial \cal^2 J}{\partial y^2} \right) \left( \frac{\partial y}{\partial x} \right) \cdot v + \left( \frac{\partial^2 y}{\partial x^2} \otimes v \right)^T \left( \frac{\partial \cal J}{\partial y} \right)^T, $$ where the $\otimes$ denotes the tensor-vector product operation.
Furthermore, suppose the Jacobian $\displaystyle \frac{\partial y}{\partial x}$ is non-singular over $\mathbb R^n$, and define $$ \widehat{v} = \frac{\partial y}{\partial x} \, v. $$
Then: $$ \frac{\partial \cal^2 J}{\partial y^2} \cdot \widehat{v} = \left( \frac{\partial y}{\partial x} \right)^{-T} \frac{\partial \cal^2 J}{\partial x^2} \left( \frac{\partial y}{\partial x} \right)^{-1} \widehat{v} - \left( \frac{\partial y}{\partial x} \right)^{-T} \left( \frac{\partial^2 y}{\partial x^2} \otimes \left( \frac{\partial y}{\partial x}\right)^{-1} \widehat{v} \right)^T \left( \frac{\partial \cal J}{\partial y} \right)^T. $$
Does this look OK? The matrix dimensions seem to match up...
I hope this answers your question, to some extent at least. Keeping track of all the matrices can be a nuisance, and I find it easier to operate with polynomials.
It is good to view this in terms of Taylor polynomials. We have $$ h(x+a) = h(x)+(Dh(x))a+\frac12(D^2h(x))(a,a)+O(|a|^3) $$ where $D^2h(x)$ is a symmetric map $\mathbb R^n\times\mathbb R^n\to\mathbb R^n$. Similarly $$ J(y+b) = J(y)+(DJ(y))b+\frac12(D^2J(y))(b,b)+O(|b|^3). $$ Note that $h$ takes values in $\mathbb R^n$ and $J$ takes values in $\mathbb R$ although the expressions look the same. Putting these together yields $$ \begin{split} J(h(x+a)) &= J(h(x)+(Dh(x))a+\frac12(D^2h(x))(a,a)+O(|a|^3)) \\&= J(h(x))+(DJ(h(x)))[(Dh(x))a+\frac12(D^2h(x))(a,a)]+\frac12(D^2J(h(x)))((Dh(x))a,(Dh(x))a)+O(|a|^3). \end{split} $$ But we should have $$ J(h(x+a)) = J\circ h(x)+(D(J\circ h)(x))a+\frac12(D^2(J\circ h)(x))(a,a)+O(|a|^3). $$ These two polynomial expressions (ignoring the error term) must be equal for all $a$. Comparing the terms gives first the chain rule $$ (D(J\circ h)(x)) = (DJ(h(x)))(Dh(x)) $$ and then for the second derivative $$ (D^2(J\circ h)(x))(a,a) = (DJ(h(x)))(D^2h(x))(a,a)+(D^2J(h(x)))((Dh(x))a,(Dh(x))a). $$ If $H_J(y)$ is the Hessian matrix of $J$ at $y$, then $(D^2J(y))(b,b)=b^TH_J(y)b$. The second term of $D^2(J\circ h)$ corresponds to the matrix $Dh(x)^TH_J(h(x))Dh(x)$, but the first one cannot be written as neatly in matrix form. With indices, we have $$ (D^2(J\circ h)(x))(a,a) = \sum_{i,j,k}\partial_iJ(h(x))\partial_j\partial_kh_i(x)a_ja_k + \sum_{i,j,k,l}\partial_ih_j(x)\partial_j\partial_kJ(h(x))\partial_kh_l(x)a_ia_l . $$