Suppose I have a scalar $J(n)$ and two vectors, $\mathbf{w}(n)$ and $\mathbf{x}(n)$. Now, suppose that $J(n)$ is a fairly straightforward function of $\mathbf{w}(n)$, and $\mathbf{w}(n)$ is actually a vector function of $\mathbf{x}(n)$ (and a complicated one, for that matter, although $\frac{\partial \mathbf{w}(n)}{\partial \mathbf{x}(n)}$ is relatively easy to calculate). In order to obtain the gradient of $J(n)$ with respect to $\mathbf{x}(n)$, I used
$$ \frac{\partial J(n)}{\partial \mathbf{x}(n)} = \frac{\partial \mathbf{w}(n)}{\partial \mathbf{x}(n)} \frac{\partial J(n)}{\partial \mathbf{w}(n)}.$$
Now, I'd like to calculate the Hessian of $J(n)$ with respect to $\mathbf{x}(n)$. It seems to me that it would be much easier to calculate $\frac{\partial^2 J(n)}{\partial \mathbf{w}^2(n)}$ first and then use the chain rule to obtain $\frac{\partial^2 J(n)}{\partial \mathbf{x}^2(n)}$ from that result, but I'm having some difficulty doing so. I know that, in single-variable calculus, if we have $y = f(u)$ and $u = g(x)$, then
$$\frac{d^2y}{dx^2} = \frac{d^2y}{du^2}\bigg(\frac{du}{dx}\bigg)^2 + \frac{dy}{du} \bigg( \frac{d^2u}{dx^2} \bigg),$$
but how does that apply to matrix calculus? I mean, $\frac{\partial^2 \mathbf{w}(n)}{\partial \mathbf{x}^2(n)}$ is a third-order tensor, right? How do I deal with this situation? Is it even possible to do what I am thinking?
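To make the tensor term concrete, here is a small numerical sketch. The candidate formula it checks is the natural vector analogue of the single-variable identity above: the first term becomes $\frac{\partial \mathbf{w}}{\partial \mathbf{x}} \frac{\partial^2 J}{\partial \mathbf{w}^2} \big(\frac{\partial \mathbf{w}}{\partial \mathbf{x}}\big)^T$, and the tensor term collapses into a sum of the Hessians of each component $w_k$ weighted by the scalar $\frac{\partial J}{\partial w_k}$. The functions $\mathbf{w}$ and $J$ below are toy examples of my own choosing, not the actual ones, and the check is by finite differences, so this is only a sanity test of the formula, not a proof.

```python
import numpy as np

# Toy example (hypothetical, just for checking the formula):
# w(x) = (x1^2, x1*x2, sin(x2)),  J(w) = w . w
def w(x):
    return np.array([x[0]**2, x[0]*x[1], np.sin(x[1])])

def J(x):
    return w(x) @ w(x)

def hessian_fd(f, x, h=1e-5):
    """Central-difference Hessian of a scalar function f at x."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * h**2)
    return H

def jacobian_fd(wfun, x, h=1e-6):
    """dw/dx in denominator layout: entry (i, k) = dw_k / dx_i."""
    n, m = len(x), len(wfun(x))
    Jm = np.zeros((n, m))
    for i in range(n):
        e = np.zeros(n); e[i] = h
        Jm[i, :] = (wfun(x + e) - wfun(x - e)) / (2 * h)
    return Jm

x0 = np.array([0.7, -0.3])

Jw = jacobian_fd(w, x0)   # dw/dx
g = 2 * w(x0)             # dJ/dw for J = w . w
Hw = 2 * np.eye(3)        # d^2 J / dw^2 for J = w . w

# Candidate chain rule: Jacobian-sandwiched Hessian in w, plus the
# curvature of each component w_k weighted by dJ/dw_k.
H_chain = Jw @ Hw @ Jw.T
for k in range(3):
    H_chain += g[k] * hessian_fd(lambda x: w(x)[k], x0)

H_direct = hessian_fd(J, x0)  # Hessian of J(x) computed directly
print(np.allclose(H_chain, H_direct, atol=1e-4))
```

On this toy problem the two Hessians agree to finite-difference accuracy, which at least suggests the weighted-sum interpretation of the tensor term is the right one.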