Derivative involving vec() operator


Let $X$ be an $m\times n$ matrix, $B$ be an $n\times n$ matrix, and $D$ be an $n\times n$ diagonal matrix with diagonal elements $d_i$. Is it possible to write down the derivative of the following expression with respect to the vector $d$ (or the matrix $D$)?

$$ \text{vec}(X(B+D)^{-1})^T \text{vec}(X(B+D)^{-1}) $$

Best answer

Define a diag() function which returns the main diagonal of its matrix argument as a column vector, and a Diag() function which generates a diagonal matrix from a vector argument.

Then define the following matrices, where the differential of $Y$ follows from $d(C^{-1}) = -C^{-1}\,dC\,C^{-1}$:
$$\eqalign{ A &= D = {\rm Diag}(a) \\ C &= A+B &\implies dC=dA={\rm Diag}(da) \\ Y &= XC^{-1} &\implies dY=-XC^{-1}\,dC\,C^{-1} \\ }$$
Write the function in terms of the new matrices, then take its differential and isolate $da$:
$$\eqalign{ \phi &= {\rm vec}(Y)^T\,{\rm vec}(Y) = Y:Y \\ d\phi &= 2Y:dY \\ &= -2Y:\left(XC^{-1}\,dC\,C^{-1}\right) \\ &= -2\left(C^{-T}X^TYC^{-T}\right):dC \\ &= -2\left(C^{-T}X^TYC^{-T}\right):{\rm Diag}(da) \\ &= -2\operatorname{diag}\left(C^{-T}X^TYC^{-T}\right):da \\ \frac{\partial\phi}{\partial a} &= -2\operatorname{diag}\left(C^{-T}X^TYC^{-T}\right) \\ &= -2\operatorname{diag}\Big((A+B)^{-T}X^TX(A+B)^{-1}(A+B)^{-T}\Big) \\ }$$
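The closed-form gradient can be sanity-checked numerically. The following is a minimal sketch in Python with NumPy, not part of the original answer: the function names `phi`, `grad_phi`, and `numeric_grad`, the matrix sizes, the random seed, and the step size `h` are all assumptions chosen for illustration. It compares $-2\operatorname{diag}\left(C^{-T}X^TYC^{-T}\right)$ against central finite differences of $\phi$.

```python
import numpy as np

def phi(X, B, a):
    """phi = vec(Y)^T vec(Y) with Y = X (B + Diag(a))^{-1}."""
    Y = X @ np.linalg.inv(B + np.diag(a))  # np.diag on a 1-D array builds Diag(a)
    return np.sum(Y * Y)  # sum of squared entries; independent of the vec() ordering

def grad_phi(X, B, a):
    """Closed-form gradient: -2 diag(C^{-T} X^T Y C^{-T}), with C = Diag(a) + B."""
    C_inv = np.linalg.inv(B + np.diag(a))
    Y = X @ C_inv
    M = C_inv.T @ X.T @ Y @ C_inv.T
    return -2.0 * np.diag(M)  # np.diag on a 2-D array extracts the main diagonal

def numeric_grad(X, B, a, h=1e-6):
    """Central finite differences, perturbing one component of a at a time."""
    g = np.zeros_like(a)
    for i in range(a.size):
        e = np.zeros_like(a)
        e[i] = h
        g[i] = (phi(X, B, a + e) - phi(X, B, a - e)) / (2.0 * h)
    return g

# Arbitrary test data (hypothetical sizes and seed, chosen only for this check)
rng = np.random.default_rng(0)
m, n = 4, 3
X = rng.standard_normal((m, n))
B = 0.3 * rng.standard_normal((n, n)) + 5.0 * np.eye(n)  # shift keeps B + Diag(a) well conditioned
a = rng.uniform(1.0, 2.0, size=n)

analytic = grad_phi(X, B, a)
numeric = numeric_grad(X, B, a)
print("max abs difference:", np.max(np.abs(analytic - numeric)))
```

If the derivation is right, the printed difference should be tiny compared with the gradient entries themselves. Note that `B` here is a general (non-symmetric) matrix, so the check also exercises the transposes in the formula.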


In some of the steps above, a colon is used to represent the trace/Frobenius product, i.e.
$$\eqalign{ A:B = {\rm Tr}(A^TB) }$$
The cyclic property of the trace allows the terms in such products to be rearranged in a number of ways, e.g.
$$\eqalign{ A:B &= A^T:B^T &= B:A \\ A:BC &= AC^T:B &= B^TA:C \\ }$$
The variable name $D$ was replaced with $A$ because visually, $da$ is easily identified as the differential of the vector $a$, while $dd$ looks ambiguous.
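As an illustration of how the rearrangement rules are used, the step that isolates $dC$ in the derivation can be expanded as below. This is a sketch of one possible route, and the symbols $P$ and $M$ are shorthand introduced only here. In the rules above, $A$, $B$, $C$ stand for arbitrary conformable matrices, not the specific matrices of the problem. Applying $A:BC = AC^T:B$ first and $A:BC = B^TA:C$ second, with $P = XC^{-1}$, gives
$$\eqalign{ Y:\left(P\,dC\,C^{-1}\right) &= YC^{-T}:\left(P\,dC\right) \\ &= P^TYC^{-T}:dC \\ &= C^{-T}X^TYC^{-T}:dC \\ }$$
The passage from ${\rm Diag}(da)$ to $da$ uses the fact that a Frobenius product with a diagonal matrix only picks up diagonal entries. Writing $M = C^{-T}X^TYC^{-T}$,
$$\eqalign{ M:{\rm Diag}(da) &= \sum_{i,j} M_{ij}\,{\rm Diag}(da)_{ij} \\ &= \sum_{i} M_{ii}\,da_i \\ &= \operatorname{diag}(M):da \\ }$$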