Derivative of trace of inverse matrix?


I've been trying to derive the formula for the derivative of $\operatorname{Tr}(X^{-1})$ w.r.t. $X$, which I know is $-(X^{-2})^T$. According to the Matrix Cookbook, $$\dfrac{\partial g(U)}{\partial X_{ij}} = \operatorname{Tr}\left[\left(\dfrac{\partial g(U)}{\partial U}\right)^T \dfrac{\partial U}{\partial X_{ij}}\right]$$ So, if we let $U = X^{-1}$ and $g(U) = \operatorname{Tr}[U]$, we get $$\dfrac{\partial \,\operatorname{Tr}(X^{-1})}{\partial X_{ij}} = \operatorname{Tr}\left[\left(\dfrac{\partial \,\operatorname{Tr}(U)}{\partial U}\right)^T \dfrac{\partial X^{-1}}{\partial X_{ij}}\right] = \operatorname{Tr}\left[ \dfrac{\partial X^{-1}}{\partial X_{ij}}\right]$$ since the derivative of the trace of a matrix w.r.t. that matrix is just the identity. However, I'm not sure how to proceed from here. I know that the derivative of the inverse of a matrix w.r.t. that matrix is $-X^{-2}$, but I don't know what the derivative of the inverse w.r.t. a specific entry would be, nor what the trace of that derivative matrix would be. Is there something I'm missing here?


Best answer

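Derivatives of matrix functions like this one are better handled as directional derivatives. Denote $h(X)=\operatorname{tr}(X^{-1})$ and let $U$ be an arbitrary direction (a matrix of the same size as $X$, not the $U = X^{-1}$ of the question). Using the linearity of the trace and the identity $A^{-1}-B^{-1}=A^{-1}(B-A)B^{-1}$ with $A = X+tU$, $B = X$, we have: $$ dh(X)(U)=\lim_{t\to 0}\frac{h(X+tU)-h(X)}{t}=\lim_{t\to 0}\frac{\operatorname{tr}((X+tU)^{-1})-\operatorname{tr}(X^{-1})}{t} $$ $$ =\operatorname{tr}\Big(\lim_{t\to 0}\frac{(X+tU)^{-1}-X^{-1}}{t}\Big) =\operatorname{tr}\Big(\lim_{t\to 0}(X+tU)^{-1}\frac{X-(X+tU)}{t}X^{-1}\Big) $$ $$ =\operatorname{tr}\Big(\lim_{t\to 0}(X+tU)^{-1}(-U)X^{-1}\Big)=-\operatorname{tr}(X^{-1}UX^{-1}). $$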
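As a sanity check, the directional-derivative formula can be compared against a central finite difference. The following is only a minimal sketch using NumPy; the helper names (`trace_inv`, `directional_derivative`, `gradient`), the test matrix, and the step size `t` are illustrative choices, not part of the original derivation. The `gradient` helper assumes the entries of $X$ are independent, in which case the formula above corresponds to the gradient $-(X^{-2})^T$.

```python
import numpy as np

def trace_inv(X):
    """h(X) = tr(X^{-1})."""
    return np.trace(np.linalg.inv(X))

def directional_derivative(X, U):
    """Analytic directional derivative: -tr(X^{-1} U X^{-1})."""
    Xinv = np.linalg.inv(X)
    return -np.trace(Xinv @ U @ Xinv)

def gradient(X):
    """Gradient matrix -(X^{-2})^T, assuming independent entries of X."""
    Xinv = np.linalg.inv(X)
    return -(Xinv @ Xinv).T

rng = np.random.default_rng(0)
n = 4
# Illustrative test matrix: non-symmetric and strictly diagonally dominant,
# hence invertible (an assumption made only for this check).
X = rng.uniform(-1.0, 1.0, (n, n)) + 2 * n * np.eye(n)
U = rng.uniform(-1.0, 1.0, (n, n))

# Central finite difference of h along the direction U.
t = 1e-6
fd = (trace_inv(X + t * U) - trace_inv(X - t * U)) / (2 * t)

analytic = directional_derivative(X, U)
# Pairing the gradient with U entrywise should reproduce the same number.
via_gradient = np.sum(gradient(X) * U)

print(fd, analytic, via_gradient)
```

The three printed numbers should agree up to finite-difference error. A non-symmetric $X$ is used on purpose, so that a missing transpose in the gradient would show up as a mismatch.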

If one wants to use the formulae in the Matrix Cookbook (all of which follow from directional derivatives as above), then use the chain rule in the form $\partial (g(X))=(\partial g)(\partial X)$, i.e. the differential of $g$ applied to $\partial X$. In our case: $$ \partial(\operatorname{tr}(X^{-1}))=\operatorname{tr}(\partial(X^{-1}))=\operatorname{tr}(-X^{-1}(\partial X)X^{-1})= -\operatorname{tr}(X^{-1}(\partial X)X^{-1}) $$ (the trace is linear, so the differential of the trace is the trace itself).
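To connect this back to the entrywise form asked about in the question, one can take the direction to be a single-entry matrix. Here $E_{ij}$ is notation introduced only for this sketch: the matrix with a $1$ in position $(i,j)$ and zeros elsewhere, so that $\partial X/\partial X_{ij} = E_{ij}$ when the entries of $X$ are treated as independent. Under that assumption, and using the cyclic property of the trace together with $\operatorname{tr}(E_{ij}A) = A_{ji}$: $$ \frac{\partial X^{-1}}{\partial X_{ij}} = -X^{-1}E_{ij}X^{-1}, \qquad \frac{\partial \operatorname{tr}(X^{-1})}{\partial X_{ij}} = -\operatorname{tr}(X^{-1}E_{ij}X^{-1}) = -\operatorname{tr}(E_{ij}X^{-2}) = -(X^{-2})_{ji}. $$ Collecting these entries into a matrix gives $-(X^{-2})^T$, which matches the Matrix Cookbook result. If $X$ is constrained (for example, symmetric), the entries are no longer independent and this entrywise form would need adjusting.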