Differentiate with Kronecker product


I have the function $$\operatorname{tr}(\mathbf{M}(\mathbf{B}\otimes\mathbf{A}))$$ where $\mathbf{M}$ and $\mathbf{B}$ are constant matrices and $\mathbf{A}$ is the variable.

I want to write its differential in the form $$d \operatorname{tr}(\mathbf{M}(\mathbf{B}\otimes\mathbf{A})) = \operatorname{tr}(\mathbf{G}d\mathbf{A})$$

How do I find $\mathbf{G}$?


There are 3 best solutions below

0
On BEST ANSWER

Let's write your function in terms of the Frobenius inner product, $X:Y = \text{Tr}(X^TY)$, and take its differential. Since $M$ and $B$ are constant, $dM = 0$ and $dB = 0$.

\begin{equation} \begin{split} F &= \text{Tr}(M(B \otimes A)) \\ & = M^T:(B \otimes A) \\ \implies dF & = dM^T:(B \otimes A) + M^T:(dB \otimes A + B \otimes dA)\\ & = M^T:(B \otimes dA)\\ \end{split} \end{equation}

Here we need the Kronecker factorization of $M^T$

$$M^T = \sum_{j=1}^{r}C_j \otimes D_j$$

where the matrices $C_j$ and $D_j$ have the same shapes as $B$ and $A$, respectively.

We also need the following property:

$$(X \otimes Y):(Z \otimes W) = (X:Z)(Y:W)$$

Then

\begin{equation} \begin{split} dF & = \sum_{j=1}^{r}(C_j \otimes D_j):(B \otimes dA)\\ & = \sum_{j=1}^{r}(C_j : B)(D_j : dA)\\ & = \left(\sum_{j=1}^{r}\text{Tr}(B^TC_j)D_j\right) : dA\\ \frac{dF}{dA} & = \sum_{j=1}^{r}\text{Tr}(B^TC_j)D_j\\ \end{split} \end{equation}

In the notation of the question, $X : dA = \text{Tr}(X^T dA)$, so $\mathbf{G} = \left(\frac{dF}{dA}\right)^T$.
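As a sanity check, the result can be tested numerically. The sketch below is only an illustration under stated assumptions: the matrix sizes and random data are arbitrary, and it uses the simplest possible Kronecker factorization of $M^T$ (one term per block, with $C$ a unit matrix $E_{ij}$ shaped like $B$ and $D$ the corresponding $m\times n$ block of $M^T$), which is not necessarily the factorization one would use in practice.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, q, p = 3, 4, 2, 5              # arbitrary test sizes (assumption)
A = rng.standard_normal((m, n))      # A is m x n
B = rng.standard_normal((q, p))      # B is q x p
M = rng.standard_normal((p * n, q * m))  # so that M @ kron(B, A) is square


def F(A):
    """F(A) = Tr(M kron(B, A))."""
    return np.trace(M @ np.kron(B, A))


# Trivial Kronecker factorization of M^T: one term per (i, j) block, with
# C = E_ij (q x p unit matrix) and D = the (i, j) block of M^T (m x n).
blocks = M.T.reshape(q, m, p, n)     # blocks[i, :, j, :] is block (i, j) of M^T

# Tr(B^T E_ij) = B[i, j], so dF/dA = sum_ij B[i, j] * blocks[i, :, j, :].
grad = np.einsum("ij,iajb->ab", B, blocks)

# F is linear in A, so F(dA) must equal grad : dA = tr(grad^T dA).
dA = rng.standard_normal((m, n))
G = grad.T                           # the matrix G from the question
lhs = F(dA)
rhs = np.trace(G @ dA)
print(np.isclose(lhs, rhs))
```

The block-wise factorization has $r = qp$ terms; a factorization with fewer terms gives the same gradient, since the sum always reproduces $M^T$.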

0
On

Note that $A\mapsto f(A) :=\operatorname{tr}(\mathbf{M}(\mathbf{B}\otimes\mathbf{A}))$ is a linear mapping in $A$. Thus

$$ d (f (A)) = f (dA), $$ that is $$d f(A) = \operatorname{tr}(\mathbf{M}(\mathbf{B}\otimes d\mathbf{A})).$$
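The linearity claim is easy to confirm numerically. The following is a minimal sketch with arbitrary sizes and random data (both assumptions); it checks that the increment $f(A + dA) - f(A)$ coincides with $f(dA)$, with no higher-order remainder.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, q, p = 2, 3, 4, 2              # arbitrary test sizes (assumption)
B = rng.standard_normal((q, p))
M = rng.standard_normal((p * n, q * m))


def f(A):
    """f(A) = tr(M kron(B, A))."""
    return np.trace(M @ np.kron(B, A))


A = rng.standard_normal((m, n))
dA = rng.standard_normal((m, n))

# For a linear map the increment equals f(dA) up to rounding error.
increment = f(A + dA) - f(A)
linear_part = f(dA)
print(np.isclose(increment, linear_part))
```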

0
On

$\def\l{\big(}\def\r{\big)}$Given matrix variables of the following dimensions $$\eqalign{ A\in{\mathbb R}^{m\times n} \qquad B\in{\mathbb R}^{q\times p} \qquad M\in{\mathbb R}^{pn\times qm} \\ }$$ Rearrange the function to free $A$ from the Kronecker product $$\eqalign{ \phi &= {\rm Tr}\l M(B\otimes A)\r \\ &= M^T:\l B\otimes A\r \\ &= M^T:\l B\otimes I_m\r\,\l I_p\otimes A\r \\ &= \l B\otimes I_m\r^TM^T:\l I_p\otimes A\r \\ &= \sum_{k=1}^p\;\l B\otimes I_m\r^TM^T:\l e_ke_k^T\otimes A\r \\ &= \sum_{k=1}^p\;\l B\otimes I_m\r^TM^T:\l e_k\otimes I_m\r\l I_{\tt1}\otimes A\r\l e_k\otimes I_n\r^T \\ &= \sum_{k=1}^p\;\l e_k\otimes I_m\r^T\l B\otimes I_m\r^TM^T\l e_k\otimes I_n\r:\l I_{\tt1}\otimes A\r \\ &= \sum_{k=1}^p\;\l Be_k\otimes I_m\r^TM^T\l e_k\otimes I_n\r:A \\ }$$ where $I_n$ denotes the $\,(n\times n)\,$ identity matrix and $e_k$ the $k^{th}$ column of $I_p$

In this new form, the gradient calculation is trivial $$\eqalign{ \frac{\partial \phi}{\partial A} &= \sum_{k=1}^p \l Be_k\otimes I_m\r^TM^T\l e_k\otimes I_n\r \\ }$$
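The closed-form gradient can be checked against a direct entrywise evaluation. This is a sketch under assumptions: the sizes and random matrices are arbitrary, and the reference values rely on $\phi$ being linear in $A$, so that $\partial\phi/\partial A_{ab} = \phi(E_{ab})$ for the unit matrix $E_{ab}$.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, q, p = 3, 2, 4, 3              # arbitrary test sizes (assumption)
B = rng.standard_normal((q, p))
M = rng.standard_normal((p * n, q * m))


def phi(A):
    """phi(A) = Tr(M kron(B, A))."""
    return np.trace(M @ np.kron(B, A))


I_m, I_n, I_p = np.eye(m), np.eye(n), np.eye(p)

# Closed form: sum_k kron(B e_k, I_m)^T  M^T  kron(e_k, I_n)
grad = np.zeros((m, n))
for k in range(p):
    e_k = I_p[:, [k]]                # k-th column of I_p, shape (p, 1)
    left = np.kron(B @ e_k, I_m).T   # shape (m, q*m)
    right = np.kron(e_k, I_n)        # shape (p*n, n)
    grad += left @ M.T @ right

# Reference: phi is linear, so its partial derivative w.r.t. A[a, b]
# is phi evaluated at the unit matrix E_ab.
reference = np.zeros((m, n))
for a in range(m):
    for b in range(n):
        E = np.zeros((m, n))
        E[a, b] = 1.0
        reference[a, b] = phi(E)

print(np.allclose(grad, reference))
```

In the question's notation, `grad.T` plays the role of $\mathbf{G}$, because $X : dA = \operatorname{tr}(X^T dA)$.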