I'm new here, so "Hi" to everyone :D
I have the following problem. I have the matrices $A$, $B$, $C$, $X$ and $Y$. All matrices are square (say $n$-by-$n$). In particular:

- $A$ is full rank;
- $B$ is symmetric and positive (semi)definite;
- $C$ is diagonal and positive definite;
- $Y$ is diagonal and positive definite;
- $X$ is diagonal ($X = \operatorname{diag}\{x_1, \ldots,x_n\}$) and it is the unknown matrix.
Then I have the following function: $f(X) = (A(B+X^{T}YX)^{-1}A^{T} + C)^{-1}$ (it may seem dumb to write $X^{T}$ since it is diagonal, but I think this is the best way to write it).
I would like to evaluate the derivative of the trace of $f(X)$ with respect to each $x_i$.
Any idea?
If we perturb an invertible matrix $M$ by a small $\Delta M$, the first-order change in $M^{-1}$ is given by $\Delta (M^{-1}) := (M+\Delta M)^{-1} - M^{-1} = -M^{-1} (\Delta M) M^{-1} + O(\|\Delta M\|^2)$.

Now, consider $f(X) = (A(B+X^{T}YX)^{-1}A^{T} + C)^{-1}$. Applying the rule above twice and dropping second-order terms,
\begin{align}
\Delta f(X) &= \Delta\left((A(B+X^{T}YX)^{-1}A^{T} + C)^{-1}\right)\\
&\approx -f(X)\ \Delta\left(A(B+X^{T}YX)^{-1}A^{T} + C\right) f(X)\\
&= -f(X)A\, \Delta\left((B+X^{T}YX)^{-1}\right) A^{T}f(X)\\
&\approx f(X)A(B+X^{T}YX)^{-1}\, \Delta\left(B+X^{T}YX\right) (B+X^{T}YX)^{-1}A^{T}f(X)\\
&\approx f(X)A(B+X^{T}YX)^{-1} \left((\Delta X)^{T}YX + X^TY\Delta X\right) (B+X^{T}YX)^{-1}A^{T}f(X).
\end{align}

Therefore
\begin{align}
\Delta\, \mathrm{trace}f(X) &\approx \mathrm{trace}\, f(X)A(B+X^{T}YX)^{-1} \left((\Delta X)^{T}YX + X^TY\Delta X\right) (B+X^{T}YX)^{-1}A^{T}f(X)\\
&= 2\,\mathrm{trace}\, (\Delta X)^{T}YX (B+X^{T}YX)^{-1}A^{T}f(X)^2A(B+X^{T}YX)^{-1}.
\end{align}
The last equality uses the cyclic property of the trace and $\mathrm{trace}(N) = \mathrm{trace}(N^{T})$, together with the symmetry of $Y$, $B+X^{T}YX$ and $f(X)$ (which holds because $B$, $C$ and $Y$ are symmetric).

In turn,
$$
\frac{d\,\mathrm{trace}f(X)}{dX} = 2YX (B+X^{T}YX)^{-1}A^{T}f(X)^2A(B+X^{T}YX)^{-1}.
$$

This is the formula for a general square matrix $X$. For a diagonal $X$, simply take the diagonal of the above derivative. Writing $Y = \operatorname{diag}\{y_1, \ldots, y_n\}$, this gives
$$
\frac{\partial\,\mathrm{trace}f(X)}{\partial x_i} = 2\,y_i x_i \left[(B+X^{T}YX)^{-1}A^{T}f(X)^2A(B+X^{T}YX)^{-1}\right]_{ii}.
$$
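A quick way to gain confidence in the diagonal-case formula is to compare it against central finite differences. The following is only a minimal sketch, assuming NumPy is available; the function names `f` and `trace_grad`, the vector representation of the diagonals (`x`, `y`), and the random test matrices are illustrative choices, not part of the original question.

```python
import numpy as np

def f(x, A, B, C, y):
    """f(X) = (A (B + X^T Y X)^{-1} A^T + C)^{-1} for diagonal X = diag(x), Y = diag(y)."""
    M = B + np.diag(y * x**2)  # X^T Y X is diag(y_i * x_i^2) when X and Y are diagonal
    return np.linalg.inv(A @ np.linalg.inv(M) @ A.T + C)

def trace_grad(x, A, B, C, y):
    """Diagonal of 2 Y X M^{-1} A^T f(X)^2 A M^{-1}, i.e. d trace f / d x_i."""
    M_inv = np.linalg.inv(B + np.diag(y * x**2))
    F = f(x, A, B, C, y)
    G = M_inv @ A.T @ F @ F @ A @ M_inv
    return 2.0 * y * x * np.diag(G)

# Random test data satisfying the assumptions in the question (illustrative only).
rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n)) + n * np.eye(n)  # generically full rank
R = rng.standard_normal((n, n))
B = R @ R.T + np.eye(n)                          # symmetric positive definite
C = np.diag(rng.uniform(1.0, 2.0, n))            # diagonal positive definite
y = rng.uniform(1.0, 2.0, n)                     # diagonal of Y, positive
x = rng.uniform(0.5, 1.5, n)                     # diagonal of X (the unknowns)

# Central finite differences of trace f with respect to each x_i.
eps = 1e-6
grad_fd = np.array([
    (np.trace(f(x + eps * e, A, B, C, y)) - np.trace(f(x - eps * e, A, B, C, y))) / (2 * eps)
    for e in np.eye(n)
])
grad_formula = trace_grad(x, A, B, C, y)

print(np.max(np.abs(grad_fd - grad_formula)))  # should be tiny if the formula is right
```

If the two gradients disagree beyond finite-difference noise, the usual suspects are a non-symmetric $B$ or a sign slip in one of the two inverse-perturbation steps.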