Suppose $\mathbf{X}$ is an $n\times n$ positive definite matrix, $\mathbf{A}$ is an $n\times n$ constant matrix, and $b$ is a real scalar. The matrix power $\mathbf{X}^b$ is defined to be $\mathbf{U}\boldsymbol{\Lambda}^b\mathbf{U}^T$, where $\mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^T$ is the eigen-decomposition of $\mathbf{X}$.
How can I compute the derivative of $f(\mathbf{X})=\mathrm{trace}(\mathbf{X}^b\mathbf{A})$ with respect to $\mathbf{X}$? Any help would be appreciated. Thanks.
-John
First, find the gradient for a simple concrete case, like $b=3$, treating the entries of $X$ as independent variables: $$\eqalign{ f_3 &= {\rm tr}(AX^3) \cr &= A^T:XXX \cr\cr df_3 &= A^T:d(XXX) \cr &= A^T:(dX\,XX+X\,dX\,X+XX\,dX) \cr &= (A^TX^TX^T+X^TA^TX^T+X^TX^TA^T):dX \cr &= (XXA+XAX+AXX)^T:dX \cr\cr \frac{\partial f_3}{\partial X} &= (XXA+XAX+AXX)^T \cr &= \Big(\sum_{k=0}^2 \, X^k\,A\,X^{2-k}\Big)^T \cr }$$ Generalizing the formula to any positive integer $b$ yields
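$$\eqalign{ \frac{\partial f_b}{\partial X} &= \Big(\sum_{k=0}^{b-1} \, X^k\,A\,X^{b-1-k}\Big)^T \cr }$$ The sum is only defined when $b$ is a positive integer; for non-integer $b$ the gradient has to be derived from the eigen-decomposition that defines $X^b$ instead. In this derivation, I used the Frobenius product (denoted by a colon) for algebraic convenience. You can replace it with the trace if you prefer, since the two are equivalent: $$A:B={\rm tr}(A^TB)$$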
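If you want to sanity-check the formula numerically, here is a minimal NumPy sketch that compares it against central finite differences. The function names `grad_trace_power` and `finite_difference_grad`, the test matrices, and the step size `eps` are my own choices for illustration, not part of the derivation. It assumes $b$ is a positive integer and that all $n^2$ entries of $X$ are perturbed independently, so the symmetry of $X$ is not enforced.

```python
import numpy as np


def grad_trace_power(X, A, b):
    """Gradient of tr(X^b A) w.r.t. X for a positive integer b.

    Implements (sum_{k=0}^{b-1} X^k A X^{b-1-k})^T, with every entry
    of X treated as an independent variable.
    """
    total = np.zeros_like(X, dtype=float)
    for k in range(b):
        left = np.linalg.matrix_power(X, k)
        right = np.linalg.matrix_power(X, b - 1 - k)
        total += left @ A @ right
    return total.T


def finite_difference_grad(X, A, b, eps=1e-6):
    """Central-difference approximation of the same gradient."""
    def f(M):
        return np.trace(np.linalg.matrix_power(M, b) @ A)

    n = X.shape[0]
    G = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            E = np.zeros((n, n))
            E[i, j] = eps  # perturb a single entry of X
            G[i, j] = (f(X + E) - f(X - E)) / (2 * eps)
    return G


# Illustrative data: a random positive definite X and a random A.
rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
X = M @ M.T / n + np.eye(n)
A = rng.standard_normal((n, n))

for b in (1, 2, 3, 5):
    analytic = grad_trace_power(X, A, b)
    numeric = finite_difference_grad(X, A, b)
    rel_err = np.max(np.abs(analytic - numeric)) / np.max(np.abs(analytic))
    print(f"b={b}: max relative error = {rel_err:.2e}")
```

The relative errors should be tiny if the formula is right. The check does not cover non-integer $b$, since `np.linalg.matrix_power` only accepts integer exponents.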