Inverse Matrix Differential


Suppose that we are given $g(\Sigma)=\Sigma^{-1}(\mu_0-\mu_1)$, where $\Sigma$ is p by p, and both $\mu_0$ and $\mu_1$ are p by 1.

I am hoping to find $\frac{dg}{d\Sigma}$. My work so far uses differentials:

$$ dg = (d\Sigma^{-1})(\mu_0-\mu_1) = (-\Sigma^{-1}(d\Sigma)\Sigma^{-1})(\mu_0-\mu_1) $$

but I am stuck at this point.

I would really appreciate it if you could show me the working steps.

P.S. I believe the final result should have dimension $\frac{1}{2}p(p+1)$ by $p$.

Best answer

You are attempting to find the derivative (gradient) of a vector with respect to a matrix. The result will not be a matrix, but a 3rd order tensor!

One way to proceed is to use vectorization (aka column stacking).

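For ease of typing, define: $\,\,S=\Sigma,\,\,\,W=S^{-1},\,\,\,s={\rm vec}(S),\,\,\,y=(\mu_1-\mu_0)$

Note that $y$ has the opposite sign to $(\mu_0-\mu_1)$, so it absorbs the minus sign in your differential.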

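Since your differential is correct, let's start from there. Applying the identity ${\rm vec}(AXB)=(B^T\otimes A)\,{\rm vec}(X)$ with $A=W$, $X=dS$, $B=Wy$, and noting that ${\rm vec}(dg)=dg$ because $dg$ is already a column vector: $$\eqalign{ {\rm vec}(dg) &= {\rm vec}\Big(W\,dS\,Wy\Big) \cr dg &= \Big((Wy)^T\otimes W\Big)\,ds \cr \frac{\partial g}{\partial s} &= (Wy)^T\otimes W \cr }$$ This Jacobian is a $p\times p^2$ matrix.

Another way to proceed is to use higher-order tensors, or else index notation.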

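In the latter notation, the differential reads $dg_i = W_{ij}\,dS_{jk}\,W_{kn}\,y_n$, so the 3rd-order tensor gradient can be written as $$\eqalign{ G_{ijk} &= \frac{\partial g_{i}}{\partial S_{jk}} = W_{ij}W_{kn}y_{n} \cr }$$ where the repeated index $n$ in the final term implies a summation over that index. This ordering of indices agrees with the column-stacked result, whose entry in row $i$ and column $j+(k-1)p$ is $W_{ij}(Wy)_k$.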
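
As a sanity check, the closed-form Jacobian $(Wy)^T\otimes W$ can be compared against central finite differences. The following is a minimal sketch in Python with NumPy, not part of the original answer. The function names `g`, `jacobian_vec`, and `jacobian_fd`, the random test matrix, and the step size `eps` are all illustrative assumptions. The sketch treats all $p^2$ entries of $\Sigma$ as independent, that is, it does not impose symmetry.

```python
import numpy as np

def g(S, mu0, mu1):
    """g(Sigma) = Sigma^{-1} (mu0 - mu1), computed with a linear solve."""
    return np.linalg.solve(S, mu0 - mu1)

def jacobian_vec(S, mu0, mu1):
    """Closed form dg/ds = (W y)^T kron W, with W = S^{-1} and y = mu1 - mu0."""
    W = np.linalg.inv(S)
    y = mu1 - mu0
    return np.kron((W @ y)[None, :], W)  # shape (p, p*p)

def jacobian_fd(S, mu0, mu1, eps=1e-6):
    """Central finite differences with respect to s = vec(S) (column stacking)."""
    p = S.shape[0]
    J = np.empty((p, p * p))
    for col in range(p * p):
        j, k = col % p, col // p  # column-major: s[col] corresponds to S[j, k]
        E = np.zeros((p, p))
        E[j, k] = eps
        J[:, col] = (g(S + E, mu0, mu1) - g(S - E, mu0, mu1)) / (2 * eps)
    return J

# Hypothetical test data: a well-conditioned random matrix and random means.
rng = np.random.default_rng(0)
p = 3
A = rng.standard_normal((p, p))
S = A @ A.T + p * np.eye(p)
mu0, mu1 = rng.standard_normal(p), rng.standard_normal(p)

J = jacobian_vec(S, mu0, mu1)
J_fd = jacobian_fd(S, mu0, mu1)
max_err = np.max(np.abs(J - J_fd))  # should be tiny if the formula is right

# Undo the column stacking: G[i, j, k] = d g_i / d S_jk.
G = J.reshape(p, p, p, order="F")
```

Reshaping the $p\times p^2$ Jacobian in column-major order recovers the third-order tensor, so `G[i, j, k]` holds $\partial g_i/\partial S_{jk}$. If `max_err` is not close to zero, the likely culprits are a sign slip in `y` or a row-major reshape.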