What is the rule for taking the matrix derivative of an inverse?

156 Views Asked by Bumbble Comm At 02 Apr 2026 - 4:30

I am looking for a rule giving a general solution for

$$\frac{\partial}{\partial X} a^T X^{-1} b$$

with $X$ a square matrix. Moreover, I would be interested how this rule differs if $X$ is symmetric.

From matrix cookbook I know

$$\frac{\partial}{\partial X} a^T X b = ab^T$$

as well as

$$\frac{\partial}{\partial X} a^T X^T b = b a^T$$

but I am not sure how these results may be of help here.

Specifically, this result is needed when taking the derivative of a multivariate normal distribution with respect to $\Sigma$ in

$$\frac{\partial}{\partial \Sigma} (2\pi)^{-D/2} \det (\Sigma)^{-1/2} \exp \big(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu) \big)$$

Original Q&A

There are 1 best solutions below

Bumbble Comm On 14 Oct 2016 - 3:49 BEST ANSWER

Use the Frobenius (:) Inner Product to rewrite the function, then find its differential and gradient $$\eqalign{ f &= ab^T:X^{-1} \cr\cr df &= ab^T:dX^{-1} \cr &= -ab^T:X^{-1}\,dX\,X^{-1} \cr &= -X^{-T}ab^TX^{-T}:dX \cr\cr \frac{\partial f}{\partial X} &= -X^{-T}ab^TX^{-T} \cr\cr }$$ The only tricky bit is knowing that $$dX^{-1}=-X^{-1}\,dX\,X^{-1}$$ which can be derived by considering the differential of $\,\,I=X^{-1}X$

This result is for the general case. If you impose a symmetry constraint on $X$ then you should modify the result to $${\rm csym}\Big(\frac{\partial f}{\partial X}\Big)$$ where $${\rm csym}(A)=A+A^T-{\rm diag}(A)$$ For details on why this particular symmetrization is needed, see the "Structured Matrices" section in the Matrix Cookbook.

What is the rule for taking the matrix derivative of an inverse?

There are 1 best solutions below

Related Questions in MATRICES

Related Questions in DERIVATIVES

Related Questions in NORMAL-DISTRIBUTION

Related Questions in INVERSE

Trending Questions

Popular # Hahtags

Popular Questions