I am looking for a rule giving a general solution for
$$\frac{\partial}{\partial X} a^T X^{-1} b$$
with $X$ a square matrix. Moreover, I would be interested how this rule differs if $X$ is symmetric.
From matrix cookbook I know
$$\frac{\partial}{\partial X} a^T X b = ab^T$$
as well as
$$\frac{\partial}{\partial X} a^T X^T b = b a^T$$
but I am not sure how these results may be of help here.
Specifically, this result is needed when taking the derivative of a multivariate normal distribution with respect to $\Sigma$ in
$$\frac{\partial}{\partial \Sigma} (2\pi)^{-D/2} \det (\Sigma)^{-1/2} \exp \big(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu) \big)$$
Use the Frobenius (:) Inner Product to rewrite the function, then find its differential and gradient $$\eqalign{ f &= ab^T:X^{-1} \cr\cr df &= ab^T:dX^{-1} \cr &= -ab^T:X^{-1}\,dX\,X^{-1} \cr &= -X^{-T}ab^TX^{-T}:dX \cr\cr \frac{\partial f}{\partial X} &= -X^{-T}ab^TX^{-T} \cr\cr }$$ The only tricky bit is knowing that $$dX^{-1}=-X^{-1}\,dX\,X^{-1}$$ which can be derived by considering the differential of $\,\,I=X^{-1}X$
This result is for the general case. If you impose a symmetry constraint on $X$ then you should modify the result to $${\rm csym}\Big(\frac{\partial f}{\partial X}\Big)$$ where $${\rm csym}(A)=A+A^T-{\rm diag}(A)$$ For details on why this particular symmetrization is needed, see the "Structured Matrices" section in the Matrix Cookbook.