derivative of gradient involving inverse of matrices

I need to take the partial derivatives of this squared Mahalanobis distance with respect to each of three matrices: $Q$, $A$, and $S$.

$$(x+Ab)^T(A^TQA+S)^{-1}(x + Ab)$$

$x$ and $b$ are vectors of equal length and $A,Q,S$ are arbitrary full rank square matrices, each of equal size. Is there a reference or some general way to approach this?

Best answer

The way I'd proceed is to calculate the differential instead, using the identity $d(M^{-1}) = -M^{-1}\,(dM)\,M^{-1}$, which holds for any invertible matrix $M$ (written $M$ here so it is not confused with your $A$). The chain and product rules are in full force and allow you to take the derivative more or less by rote.
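As a quick sanity check on that identity, here is a minimal NumPy sketch that compares the exact change in the inverse of a generic invertible matrix (called `M` here to avoid clashing with the question's $A$) against the first-order term. The matrix size, the random seed, and the perturbation scale are arbitrary choices made for illustration, not anything taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# A generic invertible matrix; the identity shift keeps it well conditioned.
M = n * np.eye(n) + 0.5 * rng.standard_normal((n, n))
# A small perturbation playing the role of dM.
dM = 1e-6 * rng.standard_normal((n, n))

M_inv = np.linalg.inv(M)

# Exact change in the inverse versus the first-order prediction -M^{-1} dM M^{-1}.
exact_change = np.linalg.inv(M + dM) - M_inv
first_order = -M_inv @ dM @ M_inv

# The mismatch is second order in dM, so the relative error should be tiny.
rel_err = np.linalg.norm(exact_change - first_order) / np.linalg.norm(first_order)
print(rel_err)
```

The relative error shrinks roughly in proportion to the size of `dM`, which is what you would expect if the first-order term is correct.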

So, for instance, calling your expression $B$ and holding everything except $Q$ constant:

\begin{align*}
dB &= (x+Ab)^T\, d\!\left[(A^TQA+S)^{-1}\right](x+Ab)\\
&= -(x+Ab)^T(A^TQA+S)^{-1}\, d(A^TQA+S)\, (A^TQA+S)^{-1}(x+Ab)\\
&= -(x+Ab)^T(A^TQA+S)^{-1}A^T\,dQ\,A\,(A^TQA+S)^{-1}(x+Ab)
\end{align*}

and then, if you need an explicit expression for $\frac{\partial B}{\partial q_{ij}}$ in coordinates, you can plug in $dQ = e_i \otimes e_j = e_i e_j^T$, the matrix with a $1$ in entry $(i,j)$ and zeros elsewhere.
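To make the last step concrete, introduce some shorthand that is not in the question: $y = x + Ab$, $M = A^TQA + S$, $w = M^{-1}y$ and $v = M^{-T}y$ (so $v = w$ whenever $M$ is symmetric). Plugging $dQ = e_i e_j^T$ into the differential then gives

$$\frac{\partial B}{\partial q_{ij}} = -(Av)_i\,(Aw)_j, \qquad\text{i.e.}\qquad \frac{\partial B}{\partial Q} = -(Av)(Aw)^T.$$

Repeating the same rote steps for the other two matrices (this is my own working, so treat it as a sketch to verify rather than a quoted result) should give $\frac{\partial B}{\partial S} = -v\,w^T$ and $\frac{\partial B}{\partial A} = (v+w)\,b^T - QAw\,v^T - Q^TAv\,w^T$, where the first term comes from the dependence of $y$ on $A$. The NumPy snippet below checks all three against central finite differences; the function names, test matrices, and step size are illustrative assumptions.

```python
import numpy as np

def B(Q, A, S, x, b):
    """Squared distance (x + A b)^T (A^T Q A + S)^{-1} (x + A b)."""
    y = x + A @ b
    M = A.T @ Q @ A + S
    return y @ np.linalg.solve(M, y)

def analytic_grads(Q, A, S, x, b):
    """Gradients of B with respect to Q, A and S, read off from the differentials."""
    y = x + A @ b
    M = A.T @ Q @ A + S
    w = np.linalg.solve(M, y)    # M^{-1} y
    v = np.linalg.solve(M.T, y)  # M^{-T} y (equals w when M is symmetric)
    dB_dQ = -np.outer(A @ v, A @ w)
    dB_dS = -np.outer(v, w)
    dB_dA = (np.outer(v + w, b)
             - np.outer(Q @ A @ w, v)
             - np.outer(Q.T @ A @ v, w))
    return dB_dQ, dB_dA, dB_dS

def numeric_grad(f, X, h=1e-6):
    """Central finite differences, perturbing one entry of X at a time."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = h  # h * e_i e_j^T
            G[i, j] = (f(X + E) - f(X - E)) / (2 * h)
    return G

# Illustrative, deliberately non-symmetric test data (assumed, not from the question).
rng = np.random.default_rng(0)
n = 3
I = np.eye(n)
Q = I + 0.1 * rng.standard_normal((n, n))
A = I + 0.1 * rng.standard_normal((n, n))
S = 2 * I + 0.1 * rng.standard_normal((n, n))
x = rng.standard_normal(n)
b = rng.standard_normal(n)

dB_dQ, dB_dA, dB_dS = analytic_grads(Q, A, S, x, b)
num_dQ = numeric_grad(lambda Z: B(Z, A, S, x, b), Q)
num_dA = numeric_grad(lambda Z: B(Q, Z, S, x, b), A)
num_dS = numeric_grad(lambda Z: B(Q, A, Z, x, b), S)

# Largest entrywise disagreement for each gradient.
print(np.max(np.abs(dB_dQ - num_dQ)))
print(np.max(np.abs(dB_dA - num_dA)))
print(np.max(np.abs(dB_dS - num_dS)))
```

The test matrices are chosen non-symmetric on purpose, so the check distinguishes $M^{-1}y$ from $M^{-T}y$; if $Q$ and $S$ are symmetric, as they usually are for a Mahalanobis distance, the two coincide and the formulas simplify.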