Given the function $$F(X,Y,Z) = \alpha^TXYZ$$ where $X, Y, Z$ are matrices of size $n \times n$ and $\alpha$ is a vector of size $n \times 1$, how do I compute the derivative of $F$ with respect to $Y$?
I found some related questions, but they did not help.
Edit: if the function has the form $F(X,Y,Z) = \alpha^TXYZ\beta$, then according to the Matrix Cookbook the derivative is $\frac{\partial F}{\partial Y} = (\alpha^T X)^T (Z\beta)^T$. Without $\beta$, however, $F$ is a $1 \times n$ row vector rather than a scalar, and the dimensions do not match.
Thank you,
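Write $a$ for $\alpha$ and $f$ for $F$, and let ${\mathcal E}$ be the 4th-order tensor with components $$\eqalign{ {\mathcal E}_{ijkl} &= \delta_{ik}\,\delta_{jl} \cr }$$ Using this tensor, and with a colon denoting the double contraction $A:B = \sum_{k,l} A_{\ldots kl}B_{kl}$, we can calculate the differential and gradient of the function as $$\eqalign{ f &= a^TXYZ \cr \cr df &= a^T(X\,dY\,Z) \cr &= a^T(X\,{\mathcal E}\,Z^T):dY \cr \cr \frac{\partial f}{\partial Y}&= a^TX\,{\mathcal E}\,Z^T \cr }$$ In components this reads $$\frac{\partial f_j}{\partial Y_{kl}} = (a^TX)_k\,Z_{lj}$$ As expected, the gradient of a vector with respect to a matrix is a 3rd-order tensor.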
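As a sanity check, here is a minimal NumPy sketch that builds this 3rd-order tensor and compares it against finite differences. The names `f`, `G_tensor`, `G_direct`, `G_fd`, the size `n = 3`, and the random test data are all my own choices for illustration. I am also assuming the convention that a matrix to the left of ${\mathcal E}$ contracts with its first index and a matrix to the right contracts with its last index, so that entry `[j, k, l]` holds $\partial f_j / \partial Y_{kl} = (a^TX)_k Z_{lj}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3  # illustrative size, not from the question
a = rng.standard_normal(n)
X = rng.standard_normal((n, n))
Y = rng.standard_normal((n, n))
Z = rng.standard_normal((n, n))


def f(Y):
    """Row vector a^T X Y Z, stored as a length-n array."""
    return a @ X @ Y @ Z


# 4th-order tensor E_ijkl = delta_ik * delta_jl
I = np.eye(n)
E = np.einsum("ik,jl->ijkl", I, I)

# Gradient a^T X E Z^T: contract a with X, X with the first index of E,
# and the last index of E with Z^T. Entry [j, k, l] is d f_j / d Y_kl.
G_tensor = np.einsum("i,im,mjkp,lp->jkl", a, X, E, Z)

# The same tensor written directly from the component formula.
G_direct = np.einsum("k,lj->jkl", a @ X, Z)

# Central finite differences, one entry of Y at a time.
eps = 1e-6
G_fd = np.zeros((n, n, n))
for k in range(n):
    for l in range(n):
        dY = np.zeros((n, n))
        dY[k, l] = eps
        G_fd[:, k, l] = (f(Y + dY) - f(Y - dY)) / (2 * eps)

max_err = np.max(np.abs(G_tensor - G_fd))
print("tensor vs direct formula agree:", np.allclose(G_tensor, G_direct))
print("max finite-difference error:", max_err)
```

Because $f$ is linear in $Y$, the finite-difference estimate should match the analytic tensor up to rounding error.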
If you would rather avoid tensors, you can vectorize the differential. Writing $y = {\rm vec}(Y)$ and using ${\rm vec}(ABC) = (C^T\otimes A)\,{\rm vec}(B)$, we obtain $$\eqalign{ {\rm vec}(df) &= {\rm vec}(a^TX\,dY\,Z) \cr &= (Z^T\otimes a^TX)\,{\rm vec}(dY) \cr &= (Z^T\otimes a^TX)\,dy \cr \cr \frac{\partial\,{\rm vec}(f)}{\partial y}&= Z^T\otimes a^TX \cr }$$ which is an ordinary $n \times n^2$ matrix.
This is equivalent to the previous result if you swap the order of the factors and replace the Kronecker product symbol with the ${\mathcal E}$ tensor.
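The vectorized form and its relation to the tensor form can be checked numerically as well. The sketch below is only an illustration: the names `J`, `G`, `r`, the size `n = 3`, and the random data are my own assumptions. It assumes the usual column-major definition of ${\rm vec}$, which in NumPy corresponds to `order="F"`, and it stores the 3rd-order gradient with entry `[j, k, l]` equal to $\partial f_j / \partial Y_{kl}$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3  # illustrative size, not from the question
a = rng.standard_normal(n)
X = rng.standard_normal((n, n))
Y = rng.standard_normal((n, n))
Z = rng.standard_normal((n, n))

r = a @ X        # the row vector a^T X, as a length-n array
f = r @ Y @ Z    # f = a^T X Y Z, as a length-n array

# Column-major vectorization of Y (stack the columns of Y).
y = Y.reshape(-1, order="F")

# Matrix gradient Z^T (kron) a^T X, of shape (n, n^2).
J = np.kron(Z.T, r[None, :])

# f is linear in Y, so vec(f) should equal J @ vec(Y) exactly.
print("J @ y reproduces f:", np.allclose(J @ y, f))

# 3rd-order tensor gradient: G[j, k, l] = (a^T X)_k * Z_lj.
G = np.einsum("k,lj->jkl", r, Z)

# Flattening the (k, l) index pair in column-major order gives J.
print("reshaped tensor equals J:", np.allclose(G.reshape(n, n * n, order="F"), J))
```

In other words, the Kronecker-product matrix is the 3rd-order tensor with its two $Y$ indices flattened into one, which is one way to read the remark about the two results being equivalent.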