Matrix Derivative d(AXA')^(-1)/dX

I am having trouble figuring out the following matrix derivative: $\frac{\partial(B X A')(AX A')^{-1}}{\partial X}$, where $X$ is square $n\times n$, $A$ is $m\times n$ with $m<n$, and $B$ is $l\times n$.

I started with $\frac{\partial BX A'}{\partial X}(AX A')^{-1}+ (BX A')\frac{\partial (AXA')^{-1}}{\partial X}$, but I can't properly figure out the second term.

There are 3 best solutions below

1
On

The desired matrix derivative consists of $n^2$ blocks, each of dimension $\ell\times m$, given by $$\begin{align}\frac{\partial(BXA')(AXA')^{-1}}{\partial x_{ij}}& =\frac{\partial(BXA')}{\partial x_{ij}}(AXA')^{-1}+(BXA')\frac{\partial(AXA')^{-1}}{\partial x_{ij}}\\& = \frac{\partial(BXA')}{\partial x_{ij}}(AXA')^{-1}-(BXA')(AXA')^{-1}\frac{\partial(AXA')}{\partial x_{ij}}(AXA')^{-1}\end{align}$$ with $X=[x_{ij}]_{i,j=1}^n$. Note that the full matrix derivative has dimensions $n\ell\times nm$, so your development is not correct as written.
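A quick numerical sanity check of the per-entry formula, sketched in Python with NumPy. The sizes, the random seed, and the helper names `F` and `dF_dxij` are illustrative assumptions, not part of the answer; `X` is taken close to the identity so that $AXA'$ is invertible for generic $A$. The sketch assembles the $n\times n$ grid of $\ell\times m$ blocks and spot-checks one block against a central finite difference.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, l = 4, 2, 3                       # illustrative sizes with m < n
A = rng.standard_normal((m, n))
B = rng.standard_normal((l, n))
X = np.eye(n) + 0.1 * rng.standard_normal((n, n))

def F(X):
    """F(X) = (B X A') (A X A')^{-1}, an l x m matrix."""
    return B @ X @ A.T @ np.linalg.inv(A @ X @ A.T)

def dF_dxij(X, i, j):
    """Per-entry derivative: product rule plus d(M^{-1}) = -M^{-1} dM M^{-1}."""
    E = np.zeros((n, n))
    E[i, j] = 1.0                       # dX / dx_ij
    Minv = np.linalg.inv(A @ X @ A.T)
    return B @ E @ A.T @ Minv - B @ X @ A.T @ Minv @ (A @ E @ A.T) @ Minv

# Assemble the n x n grid of (l x m) blocks into one (n*l) x (n*m) matrix.
D = np.block([[dF_dxij(X, i, j) for j in range(n)] for i in range(n)])

# Central finite difference for a single entry x_ij, as a spot check.
i, j, eps = 1, 2, 1e-6
E = np.zeros((n, n))
E[i, j] = 1.0
fd = (F(X + eps * E) - F(X - eps * E)) / (2 * eps)
block_ok = np.allclose(D[i*l:(i+1)*l, j*m:(j+1)*m], fd, rtol=1e-4, atol=1e-6)
print(D.shape, block_ok)
```

The block layout used here (block $(i,j)$ holds $\partial F/\partial x_{ij}$) is one common convention; other layouts reorder the same entries.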

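Using Kronecker product notation and the $\operatorname{vec}$ operator, we can write the complete solution as follows: $$\begin{align}\frac{\partial(BXA')(A XA')^{-1}}{\partial X} & = \frac{\partial(BXA')}{\partial X}\left[\mathbb{I}_n\otimes (A XA')^{-1}\right]\\&\qquad -\left[\mathbb{I}_n\otimes (BXA')(A XA')^{-1}\right]\frac{\partial(A XA')}{\partial X}\left[\mathbb{I}_n\otimes(A XA')^{-1}\right]\\ & = (\operatorname{vec} B')(\operatorname{vec} A')'\left[\mathbb{I}_n\otimes (A XA')^{-1}\right]\\ &\qquad-\left[\mathbb{I}_n\otimes (BXA')(A XA')^{-1}\right](\operatorname{vec} A')(\operatorname{vec} A')'\left[\mathbb{I}_n\otimes(A XA')^{-1}\right]\end{align}$$ For the blocks to line up, $\operatorname{vec} B'$ and $\operatorname{vec} A'$ must be read as the stacked columns of $B$ and of $A$ respectively (that is, $\operatorname{vec}$ stacks the rows of its argument), since block $(i,j)$ of $\frac{\partial(BXA')}{\partial X}$ is the $i$-th column of $B$ times the transpose of the $j$-th column of $A$.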

0
On

For convenience, define $M=(AXA^T)^{-1}$, so that the function can be expressed as $$ F = BXA^TM$$ First, let's find the differentials $$\eqalign{ dM &= -M\,\,d(A\,X\,A^T)\,\,M \cr &= -MA\,\,dX\,A^TM \cr\cr dF &= B\,dX\,A^TM + BXA^T\,dM \cr &= B\,dX\,A^TM - BXA^T\,MA\,\,dX\,A^TM \cr &= (B-BXA^TMA)\,\,dX\,A^TM \cr }$$ Dealing with matrix-by-matrix derivatives is hard. About the only thing you can do is vectorize both sides of the equation using the rule $\,\,\,{\rm vec}(AXB)=(B^T\otimes A)\,{\rm vec}(X)\,\,\,$ which, writing $f={\rm vec}(F)$ and $x={\rm vec}(X)$, yields $$\eqalign{ df &= \big((A^TM)^T\otimes(B-BXA^TMA)\big)\,dx \cr\cr \frac {\partial f} {\partial x^T} &= (M^TA)\otimes(B-BXA^TMA) \cr\cr \frac {\partial\,{\rm vec}(F)} {\partial\,{\rm vec}(X)^T} &= (M^TA)\otimes(B-BXA^TMA) \cr }$$
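The Kronecker form can be checked numerically. Below is a minimal sketch in Python with NumPy, assuming the usual column-stacking ${\rm vec}$ (which is what the rule ${\rm vec}(AXB)=(B^T\otimes A)\,{\rm vec}(X)$ requires); the sizes, the seed, and the names `vec`, `J_closed`, and `J_fd` are chosen here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, l = 4, 2, 3                       # illustrative sizes with m < n
A = rng.standard_normal((m, n))
B = rng.standard_normal((l, n))
X = np.eye(n) + 0.1 * rng.standard_normal((n, n))

def vec(Z):
    """Column-stacking vec (Fortran order), matching vec(AXB) = (B^T kron A) vec(X)."""
    return Z.reshape(-1, order="F")

def F(X):
    """F(X) = B X A^T (A X A^T)^{-1}."""
    return B @ X @ A.T @ np.linalg.inv(A @ X @ A.T)

# Closed form: (M^T A) kron (B - B X A^T M A), with M = (A X A^T)^{-1}.
M = np.linalg.inv(A @ X @ A.T)
J_closed = np.kron(M.T @ A, B - B @ X @ A.T @ M @ A)   # shape (l*m, n*n)

# Finite-difference Jacobian: one column per entry of vec(X).
eps = 1e-6
J_fd = np.zeros((l * m, n * n))
for k in range(n * n):
    dx = np.zeros(n * n)
    dx[k] = eps
    dX = dx.reshape((n, n), order="F")  # undo vec to get the perturbation of X
    J_fd[:, k] = vec(F(X + dX) - F(X - dX)) / (2 * eps)

max_err = np.max(np.abs(J_closed - J_fd))
print(J_closed.shape, max_err)
```

Using `order="F"` in both directions matters: NumPy's default row-major flattening would correspond to a different Kronecker ordering.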

0
On

The derivative is a linear map. Let $f(X)=BXA^T(AXA^T)^{-1}$, where $\operatorname{rank}(A)=m$. It suffices to differentiate a product: $Df_X:H\in M_n\mapsto BHA^T(AXA^T)^{-1}-(BXA^T)(AXA^T)^{-1}(AHA^T)(AXA^T)^{-1}$. Writing out the tensor form seems unnecessary to me.
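As a small worked step, the two terms can be factored. Writing $M=(AXA^T)^{-1}$ (a shorthand introduced here for brevity) and using $f(X)=BXA^TM$, $$Df_X(H) = BHA^TM-BXA^TM\,AHA^TM=\big(B-f(X)A\big)\,H\,A^TM.$$ Choosing $H=E_{ij}$, the $n\times n$ matrix with a $1$ in position $(i,j)$ and zeros elsewhere, gives the partial derivative with respect to a single entry of $X$: $$\frac{\partial f}{\partial x_{ij}}=\big(B-f(X)A\big)\,E_{ij}\,A^TM.$$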