Chain rule in matrix differentiation


I am having trouble differentiating $\frac {\partial(tr(AX^{-1}))}{\partial X}$. If I understand how the chain rule works for matrix derivatives (if it applies at all), then I get this result: $$\frac {\partial(tr(AX^{-1}))}{\partial X} = \frac {\partial(tr(AX^{-1}))}{\partial (X^{-1})} \frac {\partial(X^{-1})}{\partial X}$$ which, using the Matrix Cookbook, particularly equations (101) and (59), gives me $$\frac {\partial(tr(AX^{-1}))}{\partial (X^{-1})} \frac {\partial(X^{-1})}{\partial X} = A^TX^{-2}$$

This is the answer I get, but I am not at all sure that this is how it works.


There is 1 best solution below


Write the function in terms of the Frobenius inner product, $A:B = {\rm tr}(A^TB)$, and take its differential, using $dX^{-1} = -X^{-1}\,dX\,X^{-1}$: $$\eqalign{ f &= {\rm tr}(AX^{-1}) = A^T: X^{-1} \cr\cr df &= A^T: dX^{-1} \cr &= -A^T: X^{-1}\,dX\,X^{-1} \cr &= -X^{-T}A^TX^{-T}:dX \cr }$$ The last step uses the rearrangement rule $A:BCD = B^TAD^T:C$, which follows from the cyclic property of the trace. Since $df=\big(\frac{\partial f}{\partial X}:dX\big),\,$ the gradient must be $$\eqalign{ \frac{\partial f}{\partial X} &= -X^{-T}A^TX^{-T} \cr }$$
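A quick numerical check can build confidence in the result $-X^{-T}A^TX^{-T}$. The sketch below is not part of the original answer: it assumes NumPy, uses random test matrices, and compares the formula against central finite differences. The helper names `f`, `analytic_grad`, and `numeric_grad` are made up for illustration.

```python
import numpy as np


def f(A, X):
    """f(X) = tr(A X^{-1})."""
    return np.trace(A @ np.linalg.inv(X))


def analytic_grad(A, X):
    """Gradient from the derivation: -X^{-T} A^T X^{-T}."""
    X_inv_T = np.linalg.inv(X).T
    return -X_inv_T @ A.T @ X_inv_T


def numeric_grad(A, X, h=1e-6):
    """Central finite differences, perturbing one entry of X at a time."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = h
            G[i, j] = (f(A, X + E) - f(A, X - E)) / (2 * h)
    return G


rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
# Assumption: shift X toward a multiple of the identity so it is safely invertible.
X = n * np.eye(n) + 0.5 * rng.standard_normal((n, n))

G_analytic = analytic_grad(A, X)
G_numeric = numeric_grad(A, X)
max_err = np.max(np.abs(G_analytic - G_numeric))
print("max abs difference:", max_err)
```

The scalar case gives the same sanity check by hand: for $f(x) = a/x$ the derivative is $-a/x^2$, so the minus sign in the matrix formula is expected.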