How to compute the derivative of $\frac{\partial AB^T }{\partial{A}}$ and $\frac{\partial AB^T }{\partial{B}}$


How to compute the derivative of $$\frac{\partial AB^T }{\partial{A}}$$ and $$\frac{\partial AB^T }{\partial{B}}$$

where $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{r \times n}$.

Also, how can we analyze the dimensions of the final result? For example, is $\frac{\partial AB^T }{\partial{A}}$ a matrix or a tensor?

Thanks a lot.


There are 2 best solutions below


Write the function using index notation (with the summation convention over repeated indices).

$$\eqalign{ F_{ik} &= A_{ij} B_{jk}^T = A_{ij} B_{kj} \\ dF_{ik} &= dA_{ij}\,B_{kj} + A_{ij}\,dB_{kj} \\ }$$

Holding $B$ constant (i.e. $dB_{kj}=0$) yields the derivative with respect to $A$.

$$\eqalign{ \frac{\partial F_{ik}}{\partial A_{pq}} &= (\delta_{ip}\delta_{jq})\,B_{kj} = \delta_{ip}\,B_{kq} \\ }$$

Similarly, holding $A$ constant yields the derivative with respect to $B$.

$$\eqalign{ \frac{\partial F_{ik}}{\partial B_{pq}} &= A_{ij}\,(\delta_{kp}\delta_{jq}) = A_{iq}\,\delta_{kp} \\ }$$

Each of these derivatives requires 4 indices for its description, so they are not matrices but 4th-order tensors. Their dimensions follow from the index ranges: $\frac{\partial F}{\partial A}$ has shape $m \times r \times m \times n$ and $\frac{\partial F}{\partial B}$ has shape $m \times r \times r \times n$.
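As a sanity check, the two index formulas can be compared against finite differences. This is only a minimal NumPy sketch: the sizes `m, n, r`, the random test matrices, and the helper `numerical_jacobian` are assumptions made for illustration and are not part of the answer.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 2, 3, 4                     # arbitrary small sizes (assumption)
A = rng.standard_normal((m, n))
B = rng.standard_normal((r, n))

# Analytic 4th-order tensors from the index formulas:
#   dF_ik / dA_pq = delta_ip * B_kq   -> shape (m, r, m, n)
#   dF_ik / dB_pq = A_iq * delta_kp   -> shape (m, r, r, n)
dF_dA = np.einsum("ip,kq->ikpq", np.eye(m), B)
dF_dB = np.einsum("iq,kp->ikpq", A, np.eye(r))

def numerical_jacobian(f, X, eps=1e-6):
    """Central finite differences of f at X (hypothetical helper).

    The result has shape f(X).shape + X.shape.
    """
    out = np.zeros(f(X).shape + X.shape)
    for idx in np.ndindex(X.shape):
        E = np.zeros_like(X)
        E[idx] = eps                  # perturb a single entry of X
        out[(Ellipsis,) + idx] = (f(X + E) - f(X - E)) / (2 * eps)
    return out

num_dA = numerical_jacobian(lambda X: X @ B.T, A)   # B held constant
num_dB = numerical_jacobian(lambda X: A @ X.T, B)   # A held constant

print(dF_dA.shape, np.allclose(dF_dA, num_dA, atol=1e-6))
print(dF_dB.shape, np.allclose(dF_dB, num_dB, atol=1e-6))
```

The shapes of `dF_dA` and `dF_dB` make the "4th-order tensor" statement concrete: two indices for the entry of $F$ and two for the entry of the matrix being varied.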


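You can also use vectorization, which stacks the columns of a matrix into a single vector and relies on the identity $\text{vec}(XYZ) = (Z^T \otimes X)\,\text{vec}(Y)$.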

Let

\begin{equation} \begin{split} C & = AB^T \\ dC & = (dA)B^T + A(dB^T) \\ \text{vec}(dC) & = \text{vec}((dA)B^T) + \text{vec}(A(dB^T)) \\ & = (B \otimes I)\,\text{vec}(dA) + (I \otimes A)\,\text{vec}(dB^T) \\ & = (B \otimes I)\,\text{vec}(dA) + (I \otimes A)K\,\text{vec}(dB) \end{split} \end{equation}

Then

\begin{equation} \begin{split} \frac{\partial\,\text{vec}(C)}{\partial\,\text{vec}(A)^T} & = (B \otimes I) \\ \frac{\partial\,\text{vec}(C)}{\partial\,\text{vec}(B)^T} & = (I \otimes A)K \end{split} \end{equation}

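where $\otimes$ is the Kronecker product and $K$ is the commutation matrix, i.e. the $rn \times rn$ permutation matrix satisfying $\text{vec}(dB^T) = K\,\text{vec}(dB)$. The identity in $(B \otimes I)$ is $I_m$ and the one in $(I \otimes A)$ is $I_r$, so both results are ordinary matrices, of size $mr \times mn$ and $mr \times rn$ respectively.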
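The vectorized result can be checked numerically as well. The sketch below assumes the column-stacking convention for vec (NumPy's `order="F"`); the helpers `vec` and `commutation_matrix`, the sizes, and the random matrices are hypothetical names and values chosen for illustration.

```python
import numpy as np

def vec(X):
    """Column-stacking vectorization (assumed convention)."""
    return X.reshape(-1, order="F")

def commutation_matrix(rows, cols):
    """Permutation K with K @ vec(X) == vec(X.T) for X of shape (rows, cols)."""
    K = np.zeros((rows * cols, rows * cols))
    for i in range(rows):
        for j in range(cols):
            # X[i, j] is entry i + j*rows of vec(X)
            # and entry j + i*cols of vec(X.T).
            K[j + i * cols, i + j * rows] = 1.0
    return K

rng = np.random.default_rng(1)
m, n, r = 2, 3, 4                     # arbitrary small sizes (assumption)
A = rng.standard_normal((m, n))
B = rng.standard_normal((r, n))
K = commutation_matrix(r, n)          # acts on vec(B), with B of shape (r, n)

J_A = np.kron(B, np.eye(m))           # (B kron I_m),      shape (m*r, m*n)
J_B = np.kron(np.eye(r), A) @ K       # (I_r kron A) @ K,  shape (m*r, r*n)

# C = A B^T is bilinear, so dC = dA B^T + A dB^T holds exactly
# for arbitrary perturbations dA and dB.
dA = rng.standard_normal((m, n))
dB = rng.standard_normal((r, n))
dC = dA @ B.T + A @ dB.T

print(np.allclose(K @ vec(B), vec(B.T)))
print(np.allclose(vec(dC), J_A @ vec(dA) + J_B @ vec(dB)))
```

Under this convention, `J_A` and `J_B` hold the same numbers as the 4th-order tensors from the index-notation approach, just flattened into matrices.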