I'm trying to differentiate a quadratic form involving an inverse matrix, $ { \partial x^TA^{-1}x \over \partial A }$.
I found the answer in the Matrix Cookbook, as follows.
$ { \partial x^TA^{-1}x \over \partial A } = -A^{-T}xx^{T}A^{-T} \tag{1} $
However, I do not understand how to derive it.
My attempt was as follows.
- (edited) Thanks to user7530: $x$ is a vector, so $x^{-1}$ sounds very wrong!
$ (x^{T}A^{-1}x)(x^{-1}Ax^{-T}) = I \tag{2} $
$ { \partial (x^{T}A^{-1}x)(x^{-1}Ax^{-T}) \over \partial A} = {\partial I \over \partial A} \tag{3} $
$ { \partial (x^{T}A^{-1}x) \over \partial A} (x^{-1}Ax^{-T}) + (x^{T}A^{-1}x) {\partial (x^{-1}Ax^{-T}) \over \partial A} = 0 \tag{4} $
$ { \partial (x^{T}A^{-1}x) \over \partial A} (x^{-1}Ax^{-T}) = - (x^{T}A^{-1}x) {\partial (x^{-1}Ax^{-T}) \over \partial A} \tag{5} $
$ { \partial (x^{T}A^{-1}x) \over \partial A} = - (x^{T}A^{-1}x) {\partial (x^{-1}Ax^{-T}) \over \partial A} (x^{T}A^{-1}x) \tag{6} $
At this point I'm stuck: I feel I have done something wrong, but I don't know where I went wrong or how to continue to obtain the answer.
I'll be glad if anyone can show me the proof or guide me.
Use the Neumann series with a direct directional-derivative approach. Let $V$ be a generic matrix, $A$ an invertible matrix, and $\epsilon$ a scalar parameter to be sent to zero. The difference quotient is
$$ \frac1\epsilon((A+\epsilon V)^{-1}-A^{-1}) = A^{-1}\frac{(I+\epsilon VA^{-1})^{-1}-I}\epsilon $$
Now take $B=\epsilon VA^{-1}$ and recall the Neumann series (which converges absolutely for $\epsilon$ small enough)
$$ (I+B)^{-1}=I-B+B^2-B^3+\dotsc $$
Work out the algebra, isolate the zero-order term in $\epsilon$, and take $\epsilon\to0$; you get the derivative of the inverse at $A$ in the direction $V$:
$$ \partial_A{A^{-1}}V=-A^{-1}VA^{-1}. $$
Check that this coincides with the scalar formula for the derivative of $1/x$ at $a$, namely $-1/a^2$.

For your formula you'll need the chain rule, which should yield
$$ \partial_A({x^*A^{-1}x})V=-x^*A^{-1}VA^{-1}x. $$

Note that you cannot write your derivative as a single matrix multiplying $V$ (which you seem to be trying to do). It is two matrices, one acting on the left and the other on the right of the increment $V$, and the result is still a linear operator (on $V$). But it is a fact of life that square-matrix multiplication is not commutative, and not every linear operator on the space of $n\times n$ matrices can be represented as multiplication by a single $n\times n$ matrix: if you think about it, the space of linear operators on $n\times n$ matrices is an $n^2\times n^2=n^4$ dimensional vector space, with as many degrees of freedom.
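
As a sanity check, the directional derivative above can be compared numerically with a finite difference and with formula (1). The sketch below assumes real $x$ (so $x^*=x^T$) and reads (1) as the matrix $G=-A^{-T}xx^TA^{-T}$ paired with $V$ entrywise, since $\sum_{ij}G_{ij}V_{ij}=\operatorname{tr}(G^TV)=-x^TA^{-1}VA^{-1}x$. The values of `A`, `x`, `V` and the helper names are illustrative choices, not part of the question, and the 2x2 size is only there to keep the example dependency-free.

```python
# Finite-difference check of the two formulas for f(A) = x^T A^{-1} x.
# A, x, V below are arbitrary illustrative values, not taken from the post.

def inv2(M):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(P, Q):
    """Product of two matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(M):
    """Transpose of a matrix given as nested lists."""
    return [list(row) for row in zip(*M)]

def f(A, x):
    """f(A) = x^T A^{-1} x, with x stored as a 2x1 column."""
    return matmul(matmul(transpose(x), inv2(A)), x)[0][0]

A = [[4.0, 1.0], [2.0, 3.0]]   # invertible, deliberately non-symmetric
x = [[1.0], [-2.0]]            # column vector
V = [[0.5, -1.0], [2.0, 0.3]]  # direction of the perturbation
eps = 1e-6

# Central difference quotient of f along the direction V.
A_plus = [[A[i][j] + eps * V[i][j] for j in range(2)] for i in range(2)]
A_minus = [[A[i][j] - eps * V[i][j] for j in range(2)] for i in range(2)]
numeric = (f(A_plus, x) - f(A_minus, x)) / (2 * eps)

# Directional derivative from the answer: -x^T A^{-1} V A^{-1} x.
Ainv = inv2(A)
xT = transpose(x)
directional = -matmul(matmul(matmul(matmul(xT, Ainv), V), Ainv), x)[0][0]

# Matrix from (1): G = -A^{-T} x x^T A^{-T}, paired with V as sum_ij G_ij V_ij.
AinvT = transpose(Ainv)
G = [[-g for g in row] for row in matmul(matmul(matmul(AinvT, x), xT), AinvT)]
paired = sum(G[i][j] * V[i][j] for i in range(2) for j in range(2))

print(numeric, directional, paired)
```

The three printed numbers should agree up to finite-difference and rounding error. A non-symmetric `A` is used on purpose, so that the transposes in (1) actually matter.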