How to arrive at $ \frac{\partial }{\partial X} \log(\mathrm{det}(I + A X B^\top )) = A^\top (I +B X^\top A^\top )^{-1} B$?


Dear matrix calculus experts, please enlighten me. How do I arrive at the following matrix derivative?

$$\frac{\partial }{\partial X} \log( \mathrm{det}(I + A X B^\top ) ) = A^\top (I + B X^\top A^\top )^{-1} B$$

I must admit that matrix calculus confuses me: I am still learning it and have not fully grasped the concepts. Wikipedia says the following:

It is often easier to work in differential form and then convert back to normal derivatives.

My confusion begins here. The differential is

$$\mathrm d \log(\det(X)) = \mbox{tr} \left( X^{-1} \mathrm d X \right)$$

Now, how do we "convert" it to a normal derivative so that I get the answer above? :/ I hope I will get this matrix calculus someday. Thank you so much in advance.

On BEST ANSWER

Rather than the trace function, the trace/Frobenius product notation $$A:B={\rm tr}(A^TB)$$ lends itself more readily to algebraic manipulations.
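As a sketch of how this notation converts a differential back to a "normal" derivative (assuming $f$ is a scalar function of $X$ and the gradient is laid out with the same shape as $X$, which is the convention the target formula uses; the symbol $G$ is introduced here only for illustration): whatever matrix multiplies $dX$ in the Frobenius product is the gradient.

$$df = G:dX = \sum_{i,j} G_{ij}\,dX_{ij} \quad\Longrightarrow\quad \frac{\partial f}{\partial X} = G$$

For the differential quoted in the question this gives

$$d\log\det X = {\rm tr}(X^{-1}\,dX) = X^{-T}:dX \quad\Longrightarrow\quad \frac{\partial \log\det X}{\partial X} = X^{-T}$$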

Start with the differential result that you quoted, but in terms of the variable $Y=I+AXB^T$ instead of $X$. Then change the variable from $Y\rightarrow X$, using $dY=A\,dX\,B^T$ and $Y^T=I+BX^TA^T$:

$$\eqalign{ d\log\det Y &= Y^{-T}:dY\cr &= Y^{-T}:A\,dX\,B^T \cr &= A^TY^{-T}B:dX \cr &= A^T(I+BX^TA^T)^{-1}B:dX\cr }$$

The gradient is therefore

$$\frac{\partial\log\det Y}{\partial X} = A^T(I+BX^TA^T)^{-1}B$$

Note that the cyclic property of the trace function gives rise to many equivalent ways to arrange the terms in a Frobenius product. For example, with $A$, $B$, $C$ now denoting generic matrices of compatible sizes:

$$\eqalign{ A:BC &= BC:A \cr &= A^T:(BC)^T \cr &= AC^T:B \cr &= B^TA:C \cr &= B^TAC^T:I \cr &= \text{etc.} \cr }$$
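A quick numerical sanity check of the gradient formula can be sketched with NumPy and central finite differences. The matrix sizes, the random seed, the 0.1 scaling (chosen so that $I+AXB^T$ stays close to the identity) and the helper names `f`, `analytic_grad` and `numeric_grad` are all assumptions made for this illustration; none of them come from the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: A is n x p, X is p x q, B is n x q, so A X B^T is n x n.
n, p, q = 4, 3, 5
A = 0.1 * rng.standard_normal((n, p))
B = 0.1 * rng.standard_normal((n, q))
X = rng.standard_normal((p, q))


def f(X):
    """log|det(I + A X B^T)|, computed stably with slogdet."""
    _, logabsdet = np.linalg.slogdet(np.eye(n) + A @ X @ B.T)
    return logabsdet


def analytic_grad(X):
    """A^T (I + B X^T A^T)^{-1} B, using solve instead of an explicit inverse."""
    Yt = np.eye(n) + B @ X.T @ A.T
    return A.T @ np.linalg.solve(Yt, B)


def numeric_grad(X, h=1e-6):
    """Entry-by-entry central finite differences of f."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = h
            G[i, j] = (f(X + E) - f(X - E)) / (2 * h)
    return G


# Largest entrywise disagreement between the two gradients.
max_err = np.max(np.abs(analytic_grad(X) - numeric_grad(X)))
print("max abs difference:", max_err)
```

If the formula is right, the printed difference should be tiny, limited only by finite-difference error. Note that the gradient has the same shape as $X$ ($p \times q$ here), which matches the layout convention used in the derivation.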