Derivative of a scalar function with respect to a matrix


I need help differentiating the following expression:

$$ \text{trace}((aI+bXX^T)^{-1}(aI+XX^T)) $$

with respect to $X$, where $a,b$ are some positive constants, and $I$ is the identity matrix.

Thank you

Best answer

Let $Y=aI+bXX^T$ and $Z=aI+XX^T$, so that $dY=b\,(dX\,X^T+X\,dX^T)$ and $dZ=dX\,X^T+X\,dX^T$. Then
\begin{align*}
d\operatorname{tr}(Y^{-1}Z) &= \operatorname{tr}\left(d(Y^{-1})\,Z + Y^{-1}\,dZ\right)\\
&= \operatorname{tr}\left(-Y^{-1}\,dY\,Y^{-1}Z + Y^{-1}\,dZ\right)\\
&= \operatorname{tr}\left(-b\,Y^{-1}(dX\,X^T+X\,dX^T)Y^{-1}Z + Y^{-1}(dX\,X^T+X\,dX^T)\right).
\end{align*}
Using the properties $\operatorname{tr}(AB)=\operatorname{tr}(BA)$ and $\operatorname{tr}(M)=\operatorname{tr}(M^T)$, together with the fact that $-b\,Y^{-1}ZY^{-1}+Y^{-1}$ is symmetric (because $Y$ and $Z$ are), the last line above can be rewritten as
$$2\operatorname{tr}\left[dX^T\left(-b\,Y^{-1}ZY^{-1} + Y^{-1}\right)X\right].$$
Hence the required derivative is
$$2\left(-b\,Y^{-1}ZY^{-1} + Y^{-1}\right)X$$
if the so-called "denominator layout" is used, that is, if the derivative is arranged to have the same shape as $X$.
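As a sanity check, the closed-form derivative above can be compared against central finite differences. The following is only a minimal sketch, assuming NumPy is available; the function names (`f`, `grad_formula`, `grad_simplified`, `grad_numeric`), the matrix size, and the values of $a$ and $b$ are illustrative choices, not part of the original answer. It also checks the equivalent form $2a(1-b)\,Y^{-2}X$, which follows from $Y-bZ=a(1-b)I$.

```python
import numpy as np


def _Y_and_Z(X, a, b):
    """Return Y = aI + b X X^T and Z = aI + X X^T."""
    n = X.shape[0]
    G = X @ X.T
    return a * np.eye(n) + b * G, a * np.eye(n) + G


def f(X, a, b):
    """Scalar function tr(Y^{-1} Z)."""
    Y, Z = _Y_and_Z(X, a, b)
    # solve(Y, Z) computes Y^{-1} Z without forming the inverse explicitly
    return np.trace(np.linalg.solve(Y, Z))


def grad_formula(X, a, b):
    """Derivative from the answer: 2 (-b Y^{-1} Z Y^{-1} + Y^{-1}) X."""
    Y, Z = _Y_and_Z(X, a, b)
    Yinv = np.linalg.inv(Y)
    return 2.0 * (-b * Yinv @ Z @ Yinv + Yinv) @ X


def grad_simplified(X, a, b):
    """Equivalent form 2 a (1 - b) Y^{-2} X, using Y - bZ = a(1 - b) I."""
    Y, _ = _Y_and_Z(X, a, b)
    Yinv = np.linalg.inv(Y)
    return 2.0 * a * (1.0 - b) * Yinv @ Yinv @ X


def grad_numeric(X, a, b, h=1e-6):
    """Entry-wise central finite differences; result has the shape of X."""
    D = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = h
            D[i, j] = (f(X + E, a, b) - f(X - E, a, b)) / (2.0 * h)
    return D


# Illustrative test data (assumed values, not from the question)
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))
a, b = 1.5, 0.7

analytic = grad_formula(X, a, b)
numeric = grad_numeric(X, a, b)
simplified = grad_simplified(X, a, b)

print("max |analytic - numeric|   :", np.max(np.abs(analytic - numeric)))
print("max |analytic - simplified|:", np.max(np.abs(analytic - simplified)))
```

Both printed differences should be tiny compared with the entries of the gradient. Note that $X$ need not be square here: the derivative is laid out with the same shape as $X$.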