Assume $X \in \mathbb{R}^{n \times n}$. I could not find a formula for the derivative of $X^{-1}$ with respect to $X$, but I did find the following formula involving the inverse of a matrix:
(1) $\frac{\partial}{\partial X} (a^TX^{-1}b) = -X^{-T}ab^TX^{-T}, \quad a, b \in \mathbb{R}^n$
Can anyone give some insight into how to derive a formula for the derivative of $X^{-1}$, or formula (1), please?
Thank you in advance.
======
Another post discusses the same topic, and I was asked whether the current post is redundant. I believe the way the problem is stated and discussed in the two posts is different. Specifically, I was trying to learn a simple approach for finding the derivative of a matrix expression that contains the inverse of a matrix. I believe the detailed answer and discussion in the current post are helpful to other learners like me (with an elementary understanding of calculus and matrices).
I imagine you want the matrix derivative of the expression above. Let $X(t)$ be a differentiable matrix-valued function that is invertible on some neighbourhood of $t = 0$. Then, by the product rule,
$$ X^{-1}(t)X(t) = I \implies \frac{\partial X^{-1}(t)}{\partial t}X(t) + X^{-1}(t)\frac{\partial X(t)}{\partial t} = 0 $$
Rearranging and multiplying on the right by $X^{-1}(t)$ yields $$ \frac{\partial X^{-1}(t)}{\partial t} = -X^{-1}(t)\frac{\partial X(t)}{\partial t} X^{-1}(t). $$
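As a quick sanity check, the identity for the derivative of the inverse can be compared against a central finite difference. This is only a sketch: the path $X(t) = X_0 + tY$, the random test matrices, the shift by $nI$ (to keep $X_0$ well conditioned), and the step size `h` are all illustrative assumptions, not part of the derivation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
# X0 is shifted by n*I so that it is comfortably invertible (illustrative choice).
X0 = rng.standard_normal((n, n)) + n * np.eye(n)
Y = rng.standard_normal((n, n))

def X(t):
    # A smooth matrix path with X(0) = X0 and dX/dt = Y.
    return X0 + t * Y

h = 1e-6
# Central finite difference of t -> X(t)^{-1} at t = 0.
fd = (np.linalg.inv(X(h)) - np.linalg.inv(X(-h))) / (2 * h)

# Closed form from the identity above: -X^{-1} (dX/dt) X^{-1}.
X0_inv = np.linalg.inv(X0)
closed_form = -X0_inv @ Y @ X0_inv

# Largest entrywise discrepancy between the two; it should be tiny.
max_err = np.max(np.abs(fd - closed_form))
print(max_err)
```

The printed discrepancy should be small relative to the entries of the derivative, limited by finite-difference truncation and round-off.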
This formula for the derivative of the inverse is probably what you were originally looking for. Continuing on to show (1) is now straightforward. Since $a$ and $b$ do not depend on $t$,
$$ \frac{\partial\, a^T X^{-1}(t)b}{\partial t} = a^T\frac{\partial X^{-1}(t)}{\partial t}b = -a^T X^{-1}(t)\frac{\partial X(t)}{\partial t} X^{-1}(t) b $$
Taking $X(t) = X + tY$ and evaluating at $t=0$ yields
$$ \frac{\partial a^T X^{-1}(t)b}{\partial t}\bigg|_{t=0} = a^T\frac{\partial X^{-1}(t)}{\partial t}\bigg|_{t=0}b =-a^T X^{-1} Y X^{-1} b $$ which, after rearranging so that the right-hand side is written as a linear map acting on a general $Y$, gives your formula (1).
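One way to make that rearrangement explicit (a sketch using the cyclic property of the trace, which the derivation here does not otherwise rely on) is to write the scalar as a trace. The name $G$ for the candidate gradient is introduced only for this illustration:

$$ -a^T X^{-1} Y X^{-1} b = -\operatorname{tr}\left(X^{-1} b\, a^T X^{-1}\, Y\right) = \operatorname{tr}\left(G^T Y\right) = \sum_{i,j} G_{ij} Y_{ij}, \qquad G = -X^{-T} a b^T X^{-T}. $$

Since the directional derivative along every $Y$ is the entrywise inner product of $G$ with $Y$, the matrix $G$ plays the role of $\frac{\partial}{\partial X}(a^T X^{-1} b)$, in agreement with (1).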
I guess I should probably just complete the solution. For a differentiable function $F:\mathbb{R}^{n\times n} \to \mathbb{R}$, and with $e_{ij} = e_ie_j^T$ where the $e_i$ are the standard basis vectors, we usually define
$$ \left(\frac{\partial F(X)}{\partial X}\right)_{ij} \equiv \frac{\partial F(X+te_{ij})}{\partial t}\bigg|_{t=0} $$
Note that this is equivalent to taking the component-wise partial derivatives $\partial F/\partial X_{ij}$, evaluated at the given 'point' (i.e. matrix) $X$.
Using this definition, i.e. taking $Y = e_{ij}$ in the directional derivative above, we get $$ \left(\frac{\partial a^T X^{-1}b}{\partial X}\right)_{ij} = -a^T X^{-1} e_{ij} X^{-1} b $$
or, writing out the multiplication explicitly with $(e_{ij})_{k\ell} = \delta_{ik}\delta_{j\ell}$, where $\delta_{ij} = 1$ when $i=j$ and $0$ otherwise is the Kronecker delta, and using the Einstein summation convention (i.e. repeated indices are implicitly summed), we get
$$ \begin{align} \left(\frac{\partial a^T X^{-1}b}{\partial X}\right)_{ij} &= -\left(a^T X^{-1}\right)_{k} \delta_{ik}\delta_{j\ell} (X^{-1} b)_{\ell} \\ &= -\left(a^T X^{-1}\right)_{i}(X^{-1} b)_{j} \\ &= -\left(\left(a^T X^{-1}\right)^T(X^{-1} b)^T\right)_{ij}\\ &= -\left(X^{-T}ab^TX^{-T}\right)_{ij} \end{align} $$
as we wished.
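For anyone who wants to double-check (1) numerically, here is a minimal sketch that perturbs one entry of $X$ at a time, exactly as in the $e_{ij}$ definition above, and compares the result with $-X^{-T}ab^TX^{-T}$. The random test data, the shift by $nI$, the step size `h`, and the helper name `F` are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
# Shift by n*I to keep X well conditioned (illustrative choice).
X = rng.standard_normal((n, n)) + n * np.eye(n)
a = rng.standard_normal(n)
b = rng.standard_normal(n)

def F(M):
    # F(M) = a^T M^{-1} b, computed with a linear solve instead of an explicit inverse.
    return a @ np.linalg.solve(M, b)

h = 1e-6
grad_fd = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        # e_ij = e_i e_j^T: a single 1 in position (i, j).
        E = np.zeros((n, n))
        E[i, j] = 1.0
        # Central difference approximation of d/dt F(X + t e_ij) at t = 0.
        grad_fd[i, j] = (F(X + h * E) - F(X - h * E)) / (2 * h)

# Formula (1): -X^{-T} a b^T X^{-T}.
X_inv_T = np.linalg.inv(X).T
grad_formula = -X_inv_T @ np.outer(a, b) @ X_inv_T

# Largest entrywise discrepancy between the finite-difference and closed-form gradients.
grad_err = np.max(np.abs(grad_fd - grad_formula))
print(grad_err)
```

The two matrices should agree up to finite-difference and round-off error. Note the transposes: dropping them (i.e. using $-X^{-1}ab^TX^{-1}$) would generally fail this check unless $X$ is symmetric.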