What is $\frac{\partial x^TA^{-1}y}{\partial A}$?


I had trouble proving the following:

If $A\in \mathbb R^{n\times n}$ is nonsingular, $x \in \mathbb R^{n\times 1}$, and $y \in \mathbb R^{n\times 1}$, then $ \dfrac{\partial x^TA^{-1}y}{\partial A} = -A^{-T}xy^TA^{-T}$, where $A^{-T} = (A^{-1})^T$.

The "Matrix Cookbook" simply mentions that the above formula could be derived from the basic identity $ \dfrac{\partial Y^{-1}}{\partial x} = -Y^{-1}\dfrac{\partial Y}{\partial x}Y^{-1}$. But how exactly does it work?

I've been stuck on this problem for quite a while. Any help would be much appreciated :)


Well, there's actually a step in between in the Matrix Cookbook, which makes this a lot clearer:

$$ \frac{\partial\left(\mathbf X^{-1}\right)_{kl}}{\partial X_{ij}}=-\left(\mathbf X^{-1}\right)_{ki}\left(\mathbf X^{-1}\right)_{jl} $$

This you can get from the relationship that you quoted by setting $x=X_{ij}$:

\begin{align} &\frac{\partial\left(\mathbf X^{-1}\right)_{kl}}{\partial X_{ij}}\\ =&-\left(\mathbf X^{-1}\frac{\partial\mathbf X}{\partial X_{ij}}\mathbf X^{-1}\right)_{kl}\\ =&-\sum_{\rho,\sigma}\left(\mathbf X^{-1}\right)_{k\rho}\left(\frac{\partial\mathbf X}{\partial X_{ij}}\right)_{\rho\sigma}\left(\mathbf X^{-1}\right)_{\sigma l}\\ =&-\sum_{\rho,\sigma}\left(\mathbf X^{-1}\right)_{k\rho}\delta_{i\rho}\delta_{j\sigma}\left(\mathbf X^{-1}\right)_{\sigma l}\\ =&-\left(\mathbf X^{-1}\right)_{ki}\left(\mathbf X^{-1}\right)_{jl}\;. \end{align}
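If you want to convince yourself of this entrywise identity numerically, a rough sketch like the following compares it against central finite differences. The $2\times 2$ test matrix, the step size `h`, and the helper `inv2` are arbitrary choices made for illustration only; they are not part of the Matrix Cookbook.

```python
def inv2(X):
    """Closed-form inverse of a 2x2 matrix stored as nested lists (illustrative helper)."""
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Arbitrary nonsingular, non-symmetric test matrix (an assumption for this sketch).
X = [[2.0, 1.0], [0.5, 3.0]]
Xinv = inv2(X)
h = 1e-6  # finite-difference step

max_err = 0.0
for i in range(2):
    for j in range(2):
        # Perturb only the entry X_ij in both directions.
        Xp = [row[:] for row in X]
        Xm = [row[:] for row in X]
        Xp[i][j] += h
        Xm[i][j] -= h
        Pinv, Minv = inv2(Xp), inv2(Xm)
        for k in range(2):
            for l in range(2):
                numeric = (Pinv[k][l] - Minv[k][l]) / (2 * h)
                analytic = -Xinv[k][i] * Xinv[j][l]  # -(X^{-1})_{ki} (X^{-1})_{jl}
                max_err = max(max_err, abs(numeric - analytic))

print(max_err)  # should be tiny compared with the entries themselves
```

Using a non-symmetric matrix matters here: with a symmetric one, mixing up the index order $ki$ and $jl$ would go unnoticed.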

Now the relationship that you want can be derived like this:

\begin{align} \frac{\partial x^\top A^{-1}y}{\partial A_{ij}} &=x^\top\frac{\partial A^{-1}}{\partial A_{ij}}y\\ &=\sum_{k,l}\left(x^\top\right)_k\frac{\partial\left(A^{-1}\right)_{kl}}{\partial A_{ij}}y_l\\ &=-\sum_{k,l}\left(x^\top\right)_k\left(A^{-1}\right)_{ki}\left(A^{-1}\right)_{jl}y_l\\ &=-\left(A^{-\top}x\right)_i\left(A^{-1}y\right)_j\;, \end{align}

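The last line is the $(i,j)$ entry of the outer product $-\left(A^{-\top}x\right)\left(A^{-1}y\right)^\top$, so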

$$ \frac{\partial x^\top A^{-1}y}{\partial A}=-A^{-\top}xy^\top A^{-\top}\;. $$
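As a sanity check on the final formula, here is a minimal sketch that compares $-A^{-\top}xy^\top A^{-\top}$ with a finite-difference gradient of $x^\top A^{-1}y$. Everything in it (the $2\times 2$ matrix, the vectors, the step size, and the helper names `inv2`, `f`, `analytic_grad`, `numeric_grad`) is made up for illustration.

```python
def inv2(A):
    """Closed-form inverse of a 2x2 matrix stored as nested lists (illustrative helper)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def f(x, A, y):
    """The scalar x^T A^{-1} y."""
    Ainv = inv2(A)
    return sum(x[k] * Ainv[k][l] * y[l] for k in range(2) for l in range(2))

def analytic_grad(x, A, y):
    """Matrix whose (i, j) entry is -(A^{-T} x)_i (A^{-1} y)_j."""
    Ainv = inv2(A)
    u = [sum(Ainv[k][i] * x[k] for k in range(2)) for i in range(2)]  # A^{-T} x
    v = [sum(Ainv[j][l] * y[l] for l in range(2)) for j in range(2)]  # A^{-1} y
    return [[-u[i] * v[j] for j in range(2)] for i in range(2)]

def numeric_grad(x, A, y, h=1e-6):
    """Central finite differences of f with respect to each entry A_ij."""
    G = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            Ap = [row[:] for row in A]
            Am = [row[:] for row in A]
            Ap[i][j] += h
            Am[i][j] -= h
            G[i][j] = (f(x, Ap, y) - f(x, Am, y)) / (2 * h)
    return G

# Arbitrary test data (assumptions for this sketch); A is deliberately non-symmetric.
A = [[2.0, 1.0], [0.5, 3.0]]
x = [1.0, -2.0]
y = [0.5, 4.0]

Ga = analytic_grad(x, A, y)
Gn = numeric_grad(x, A, y)
max_err = max(abs(Ga[i][j] - Gn[i][j]) for i in range(2) for j in range(2))
print(max_err)  # should be tiny if the formula and index order are right
```

Agreement on one example is of course not a proof, but it is a quick way to catch a misplaced transpose.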