Derivative of this matrix: $\frac{d\boldsymbol{Y}}{d\boldsymbol{X}}$ of $\boldsymbol{Y}=\boldsymbol{AXB}$


Well, the title is self-explanatory. I want to find a closed-form expression for the matrix derivative. Here are the sizes of the matrices:

$[\boldsymbol{A}] = M\times N$

$[\boldsymbol{X}] = N\times P$

$[\boldsymbol{B}] = P\times Q$

$[\boldsymbol{Y}] = M\times Q$

I want to find the derivative of $\boldsymbol{Y}$ with respect to $\boldsymbol{X}$, given that

$\boldsymbol{Y}=\boldsymbol{AXB}$

I know that if $\boldsymbol{A}$ and $\boldsymbol{B}$ are vectors, that is $M=1$ and $Q=1$ (so $\boldsymbol{Y}$ is a scalar), then the derivative of $\boldsymbol{Y}$ with respect to $\boldsymbol{X}$ is $\boldsymbol{A}^T\boldsymbol{B}^T$. But this is not a general result: for 2D matrices the product $\boldsymbol{A}^T\boldsymbol{B}^T$ is not even defined unless $M=Q$, since $\boldsymbol{A}^T$ is $N\times M$ and $\boldsymbol{B}^T$ is $Q\times P$.
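
For reference, the special case can be checked directly in index notation. This is only a sketch, assuming $\boldsymbol{A}$ is $1\times N$, $\boldsymbol{B}$ is $P\times 1$, and the derivative is laid out with the same shape as $\boldsymbol{X}$:

$$\eqalign{ Y &= \sum_{n=1}^{N}\sum_{p=1}^{P} A_{1n}\,X_{np}\,B_{p1} \cr \frac{\partial Y}{\partial X_{np}} &= A_{1n}\,B_{p1} = \big(\boldsymbol{A}^T\boldsymbol{B}^T\big)_{np} \cr }$$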

Can anybody shed some light on the matter and/or provide some good reading materials?

Thanks in advance.


There are 2 best solutions below

BEST ANSWER

Applying the vec operator to your equation reduces it to a standard matrix-vector form, with $x={\rm vec}(X)$ and $y={\rm vec}(Y)$:

$$\eqalign{ {\rm vec}(Y) &= {\rm vec}(AXB) \cr y &= (B^T\otimes A)\,{\rm vec}(X) \cr &= (B^T\otimes A)\,x \cr }$$

whose gradient is well known:

$$\eqalign{ \frac{\partial y}{\partial x} &= B^T\otimes A \cr }$$

If instead you want an expression for the matrix-by-matrix derivative, which is a 4th-order tensor, then standard matrix notation is inadequate and you must resort to index notation (summation over repeated indices is implied):

$$\eqalign{ Y_{ij} &= A_{ip}\,X_{pr}\,B_{rj} \cr\cr dY_{ij} &= A_{ip}\,dX_{pr}\,B_{rj} \cr\cr \frac{\partial Y_{ij}}{\partial X_{km}} &= A_{ip}\,\bigg(\frac{\partial X_{pr}}{\partial X_{km}}\bigg)\,B_{rj} \cr &= A_{ip}\,\delta_{pk}\,\delta_{mr}\,B_{rj} \cr &= A_{ik}\,B_{mj} \cr }$$
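
Both results can be sanity-checked numerically. The following is a minimal sketch in plain Python (no libraries), not part of the original answer. The matrices and the helper names `matmul`, `transpose`, `vec`, and `kron` are made up for illustration, and `vec` is assumed to stack columns (column-major order), which is the convention under which ${\rm vec}(AXB) = (B^T\otimes A)\,{\rm vec}(X)$ holds.

```python
# Small integer matrices (made up for this sketch) keep every check exact.
A = [[1, 2, 3], [4, 5, 6]]            # M x N = 2 x 3
X = [[1, 0], [2, -1], [0, 3]]         # N x P = 3 x 2
B = [[2, 1, 0, -1], [1, 3, 1, 2]]     # P x Q = 2 x 4
M, N, P, Q = 2, 3, 2, 4


def matmul(U, V):
    """Matrix product of two list-of-rows matrices."""
    return [[sum(U[i][k] * V[k][j] for k in range(len(V)))
             for j in range(len(V[0]))] for i in range(len(U))]


def transpose(U):
    """Swap rows and columns."""
    return [list(col) for col in zip(*U)]


def vec(U):
    """Stack the columns of U into one flat list (column-major order)."""
    return [U[i][j] for j in range(len(U[0])) for i in range(len(U))]


def kron(U, V):
    """Kronecker product: block (i, j) of the result is U[i][j] * V."""
    return [[U[i][j] * V[k][l]
             for j in range(len(U[0])) for l in range(len(V[0]))]
            for i in range(len(U)) for k in range(len(V))]


Y = matmul(matmul(A, X), B)
J = kron(transpose(B), A)             # (M*Q) x (N*P) Jacobian dy/dx
x = vec(X)
Jx = [sum(J[r][c] * x[c] for c in range(N * P)) for r in range(M * Q)]

# Fourth-order tensor from the index-notation result: T[i][j][k][m] = A_ik * B_mj.
T = [[[[A[i][k] * B[m][j] for m in range(P)] for k in range(N)]
      for j in range(Q)] for i in range(M)]

# vec(AXB) equals (B^T kron A) vec(X).
vec_identity_holds = (vec(Y) == Jx)

# Because Y is linear in X, contracting T with X over (k, m) reproduces Y.
tensor_reproduces_Y = all(
    Y[i][j] == sum(T[i][j][k][m] * X[k][m] for k in range(N) for m in range(P))
    for i in range(M) for j in range(Q))

# The Jacobian holds the same entries as T, with (i, j) and (k, m)
# flattened in column-major order.
same_entries = all(
    J[j * M + i][m * N + k] == T[i][j][k][m]
    for i in range(M) for j in range(Q) for k in range(N) for m in range(P))

print(vec_identity_holds, tensor_reproduces_Y, same_entries)
```

The last check shows that the Kronecker-product Jacobian and the fourth-order tensor carry the same numbers; they differ only in how the index pairs $(i,j)$ and $(k,m)$ are flattened.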

0
On

Well, $$dY = A(dX)B$$

so computing the answer is trivial (the derivative is the operator that sends $dX$ to $A\,(dX)\,B$), but there may be a psychological difficulty in imagining what "derivative" means here. In this case it is not a number or a matrix, but a linear operator from $N\times P$ matrices to $M\times Q$ matrices.
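
To make the "linear operator" view concrete, here is a small sketch in plain Python, offered as an illustration rather than a definitive implementation. The function names `f` and `derivative_at` and the matrices are hypothetical. It represents the derivative as a function that maps a perturbation $H$ to $AHB$, and checks that the increment of $Y$ equals that map applied to $H$, which is exact here because $Y$ is linear in $X$.

```python
def matmul(U, V):
    """Matrix product of two list-of-rows matrices."""
    return [[sum(U[i][k] * V[k][j] for k in range(len(V)))
             for j in range(len(V[0]))] for i in range(len(U))]


def add(U, V):
    """Entrywise sum of two matrices of the same shape."""
    return [[u + v for u, v in zip(ru, rv)] for ru, rv in zip(U, V)]


def sub(U, V):
    """Entrywise difference of two matrices of the same shape."""
    return [[u - v for u, v in zip(ru, rv)] for ru, rv in zip(U, V)]


def scale(c, U):
    """Multiply every entry of U by the scalar c."""
    return [[c * u for u in row] for row in U]


# Made-up example data: A is 2 x 3, X and H are 3 x 2, B is 2 x 2.
A = [[1, 2, 3], [4, 5, 6]]
B = [[2, 1], [0, -1]]
X = [[1, 0], [2, -1], [0, 3]]
H = [[0, 1], [1, 1], [-2, 0]]   # an arbitrary perturbation of X


def f(X):
    """The map X -> A X B."""
    return matmul(matmul(A, X), B)


def derivative_at(X):
    """Derivative of f at X, returned as the linear map H -> A H B.

    The result does not depend on X because f is linear in X.
    """
    return lambda H: matmul(matmul(A, H), B)


D = derivative_at(X)

# The increment of f equals the derivative applied to the perturbation.
increment_matches = (sub(f(add(X, H)), f(X)) == D(H))

# D is linear: scaling the input scales the output by the same factor.
is_homogeneous = (D(scale(3, H)) == scale(3, D(H)))

print(increment_matches, is_homogeneous)
```

Flattening this operator into a matrix with respect to the column-stacked entries of $X$ and $Y$ would give the Kronecker product $B^T\otimes A$.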