Let's say you have a scalar-valued function $f(x,y)$ of two vector arguments, where $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$. You compute the gradient of $f$ with respect to the vector $x$, giving $\nabla_x f(x,y)$, and then differentiate the result with respect to $y$, giving an $n\times m$ matrix $\nabla^2_{xy}f(x,y)$.
On the other hand, you can also take the gradient of $f$ with respect to the vector $y$, i.e., $\nabla_y f(x,y)$, and then differentiate with respect to the vector $x$, obtaining the $m\times n$ matrix $\nabla^2_{yx}f(x,y)$. Are these matrices (in general, for well-behaved functions satisfying the conditions of Schwarz's theorem) the transpose of each other, just as Schwarz's theorem says for mixed partial derivatives? Or does the result not extend to vector computations like this?
I ask because I'm trying to compute these matrices for a particular function, and the results are pretty far from being transposes of each other. Unless I'm doing something wrong?
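For reference, here is a minimal sketch of a numerical sanity check for this kind of computation. It assumes a made-up smooth test function, $f(x,y) = \sin(x_1 y_1) + x_2^2\, y_2 y_3 + x_1 e^{y_3}$ with $n=2$ and $m=3$ (not the function I'm actually working with), and the helper names `grad_x`, `grad_y`, `jacobian_fd` and `transpose` are purely illustrative. Each gradient is written out by hand and then differentiated with respect to the other variable by central differences, so the two matrices really are computed in different orders.

```python
import math

# Hand-derived gradients of the hypothetical test function
# f(x, y) = sin(x[0]*y[0]) + x[1]**2 * y[1]*y[2] + x[0]*exp(y[2])

def grad_x(x, y):
    """Gradient of f with respect to x (length n = 2)."""
    return [y[0] * math.cos(x[0] * y[0]) + math.exp(y[2]),
            2.0 * x[1] * y[1] * y[2]]

def grad_y(x, y):
    """Gradient of f with respect to y (length m = 3)."""
    return [x[0] * math.cos(x[0] * y[0]),
            x[1] ** 2 * y[2],
            x[1] ** 2 * y[1] + x[0] * math.exp(y[2])]

def jacobian_fd(g, v, h=1e-5):
    """Central-difference Jacobian of g at v (rows: outputs, columns: inputs)."""
    cols = []
    for k in range(len(v)):
        v_plus, v_minus = list(v), list(v)
        v_plus[k] += h
        v_minus[k] -= h
        g_plus, g_minus = g(v_plus), g(v_minus)
        cols.append([(a - b) / (2.0 * h) for a, b in zip(g_plus, g_minus)])
    # cols[k][r] holds d g_r / d v_k, so transpose to get rows = outputs
    return [list(row) for row in zip(*cols)]

def transpose(M):
    return [list(row) for row in zip(*M)]

# Arbitrary evaluation point (an assumption, any point would do)
x_pt = [0.7, -1.3]
y_pt = [0.4, 2.0, -0.5]

# n x m: gradient in x first, then differentiate in y
H_xy = jacobian_fd(lambda y: grad_x(x_pt, y), y_pt)
# m x n: gradient in y first, then differentiate in x
H_yx = jacobian_fd(lambda x: grad_y(x, y_pt), x_pt)

# Largest entrywise difference between H_xy and the transpose of H_yx
max_gap = max(abs(a - b)
              for row_a, row_b in zip(H_xy, transpose(H_yx))
              for a, b in zip(row_a, row_b))
print("max |H_xy - H_yx^T| =", max_gap)
```

For this smooth test function the gap is at the level of finite-difference error. Swapping in the hand-derived gradients of the real function would show whether the mismatch comes from an algebra slip or a row/column layout convention, assuming the function is twice continuously differentiable at the chosen point.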