I'm trying to write down the Hessian of a very ugly function for some numerical optimization. The function $f(\alpha,\beta)$ is real valued, but while $\alpha$ is a scalar, $\beta$ is a vector. So my problem is I'm getting a transpose symmetry and I don't know if that's correct.
I got this: $$ \frac{\partial^2 f(\alpha,\beta)}{\partial \alpha \partial \beta} = \left(\frac{\partial^2 f(\alpha,\beta)}{\partial \beta \partial \alpha}\right)^T $$
I verified that the partial derivatives are continuous everywhere, so the two mixed partials are supposed to be equal. I remember that a scalar function differentiated by a vector is a row vector, and of course differentiating that vector by a scalar should yield a vector with the same shape, so basically I think I messed up the algebra somewhere. But I'm not sure whether this might actually be correct, because I've been trying to find the error for hours without luck.
It's been a really long time since I've done any matrix calculus and I can't find any reference on this sort of problem. I guess the question is pretty stupid, so thanks for answering!
EDIT: I will add an example so you have a better idea of what I'm talking about: $$ f(\alpha, \beta) = ( c - \alpha P - X \beta)'(c - \alpha P - X \beta) \quad \alpha \in R \enspace c, P \in R^n \enspace \beta \in R^m \enspace X \in R^{n \times m} $$ $$ \frac{\partial^2 f}{\partial \alpha \partial \beta} = 2P^{\prime}X$$ $$ \frac{\partial^2 f}{\partial \beta \partial \alpha} = 2X^{\prime}P$$ So basically the shape of the first is $1 \times m$ while the shape of the second is $ m \times 1$.
Again, this is probably completely idiotic, but isn't this exactly the sort of thing one forgets?
Sometimes it's easier to consider the matrix version of a problem.
Instead of the vectors $\{\beta, c, p\}$, rewrite the function in terms of the matrices $\{B, C, P\}$ and the Frobenius product $\big(M\!:\!M=\operatorname{tr}(M^T\!M)\big)$ to obtain: $$ \eqalign { f &= (C-\alpha P - XB):(C-\alpha P - XB) \cr } $$ The differential of this function is $$ \eqalign { df &= -2\,(C-\alpha P - XB):(d\alpha\,P + X\,dB) \cr &= -2\,(C-\alpha P - XB):d\alpha\,P - 2\,(C-\alpha P - XB):X\,dB \cr &= -2\,(C-\alpha P - XB):P\,d\alpha - 2\,X^T(C-\alpha P - XB):dB \cr } $$ And the derivatives are $$ \eqalign { h &= \frac {\partial f} {\partial\alpha} &= -2\,P:(C-\alpha P - XB) \cr G &= \frac {\partial f} {\partial B} &= -2\,X^T(C-\alpha P - XB) \cr } $$ Next take the differentials of $\{G, h\}$, keeping only the cross terms (vary $B$ in $h$ and $\alpha$ in $G$; the omitted terms $2\,(P\!:\!P)\,d\alpha$ and $2\,X^TX\,dB$ give the diagonal blocks of the Hessian): $$ \eqalign { dh &= 2\,P:X\,dB \cr &= 2\,X^TP:dB \cr dG &= 2\,X^TP\,d\alpha \cr } $$ The corresponding derivatives are $$ \eqalign { \frac {\partial h} {\partial B} &= 2\,X^TP \cr \frac {\partial G} {\partial\alpha} &= 2\,X^TP \cr } $$ Thus, in the matrix version of the problem, the results are seen to be identical. Setting $B=\beta$ (an $m\times 1$ matrix) recovers the original problem: both mixed partials contain the same $m$ entries, and the $1\times m$ versus $m\times 1$ shapes in the question come only from the row/column layout convention used when writing them down.
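If you want to convince yourself numerically, here is a minimal NumPy sketch that checks both mixed partials against $2X^TP$ by central differences and then assembles the full Hessian. The sizes `n`, `m`, the random data, and the helper names `grad_alpha` and `grad_beta` are my own choices for illustration, not anything from the question.

```python
import numpy as np

# Hypothetical test data: sizes and values are arbitrary.
rng = np.random.default_rng(0)
n, m = 5, 3
c = rng.normal(size=n)
P = rng.normal(size=n)
X = rng.normal(size=(n, m))

def grad_alpha(alpha, beta):
    """h = df/d(alpha) = -2 P'(c - alpha P - X beta), a scalar."""
    r = c - alpha * P - X @ beta
    return -2.0 * (P @ r)

def grad_beta(alpha, beta):
    """G = df/d(beta) = -2 X'(c - alpha P - X beta), shape (m,)."""
    r = c - alpha * P - X @ beta
    return -2.0 * (X.T @ r)

alpha0 = 0.7
beta0 = rng.normal(size=m)
eps = 1e-6

# Differentiate h with respect to each component of beta.
d_beta_of_h = np.array([
    (grad_alpha(alpha0, beta0 + eps * e) - grad_alpha(alpha0, beta0 - eps * e)) / (2 * eps)
    for e in np.eye(m)
])

# Differentiate G with respect to alpha.
d_alpha_of_G = (grad_beta(alpha0 + eps, beta0) - grad_beta(alpha0 - eps, beta0)) / (2 * eps)

# Closed form from the derivation above.
expected = 2.0 * (X.T @ P)

print(np.allclose(d_beta_of_h, expected, atol=1e-6))
print(np.allclose(d_alpha_of_G, expected, atol=1e-6))

# Full (1 + m) x (1 + m) Hessian in the variable ordering (alpha, beta).
H = np.block([
    [np.atleast_2d(2.0 * (P @ P)), expected[None, :]],
    [expected[:, None],            2.0 * (X.T @ X)],
])
print(np.allclose(H, H.T))
```

Stored as plain 1-D arrays, the two mixed partials are the same `m` numbers. The row versus column distinction only appears when you place them in the Hessian: the top-right block is $2P^TX$ ($1\times m$) and the bottom-left block is its transpose $2X^TP$ ($m\times 1$), which is exactly what a symmetric Hessian requires.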