Frobenius norm involving Kronecker Product


Consider $J = \|\mathbf{G} - (\mathbf{B} \otimes \mathbf{X})\|_F^2$, where $\mathbf{G}$ and $\mathbf{B}$ are complex matrices and $\|\cdot\|_F$ is the Frobenius norm. Find the derivative of $J$ with respect to $\mathbf{X}$.

Note: My question is related to this post: Derivative of a trace with second order Kronecker product. I would like a solution that does not involve the SVD. Any help or hints on how to solve this problem are welcome.


There are 2 best solutions below

8
On BEST ANSWER

Let $\langle \cdot , \cdot \rangle$ denote the Frobenius inner product $\langle A,B \rangle = \operatorname{Tr}(AB^H)$, where $B^H$ is the conjugate transpose of $B$, so that $\langle A,A \rangle = \|A\|_F^2$ also holds for complex matrices. One approach is to expand $J(X + H)$, then extract the total derivative. In this case, we have $$\begin{aligned} J(X + H) &= \langle G - (B \otimes (X + H)), G - (B \otimes (X+ H)) \rangle\\ &= \langle (G - (B \otimes X)) - B \otimes H, (G - (B \otimes X)) - B \otimes H \rangle\\ &= J(X) - 2 \operatorname{Re} \langle G - (B \otimes X),B \otimes H \rangle + \|B \otimes H\|_F^2. \end{aligned}$$ The last term is $o(\|H\|)$, so the derivative of $J$ with respect to $X$ is $$ J'(X)(H) = - 2 \operatorname{Re} \langle G - (B \otimes X),B \otimes H \rangle. $$ The trick, however, is to extract the matrix form of this derivative. For the denominator-layout derivative (the gradient with the same shape as $X$), we're looking for a matrix $\frac{\partial J}{\partial X} = M$ (that depends on $X$) for which $\langle M,H \rangle = J'(X)(H)$ for all real $H$.
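As a quick sanity check, the directional derivative $J'(X)(H)$ can be compared against a central finite difference. The following is only a sketch in NumPy: the sizes, the random data, and the helper names `crandn`, `inner`, `J`, and `dJ` are illustrative assumptions, and the inner product is taken to be conjugate-linear in its second argument, which is what the $\operatorname{Re}$ in the formula requires for complex matrices.

```python
import numpy as np

rng = np.random.default_rng(0)

def crandn(*shape):
    """Random complex array with standard normal real and imaginary parts."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

def inner(A, B):
    """Frobenius inner product <A, B> = Tr(A B^H) = sum_ij A_ij * conj(B_ij)."""
    return np.sum(A * np.conj(B))

def J(G, B, X):
    """Objective ||G - (B kron X)||_F^2."""
    return np.linalg.norm(G - np.kron(B, X), "fro") ** 2

def dJ(G, B, X, H):
    """Directional derivative J'(X)(H) = -2 Re <G - B kron X, B kron H>."""
    return -2.0 * np.real(inner(G - np.kron(B, X), np.kron(B, H)))

# Hypothetical sizes: B is r x s, X and H are m x n, so G is (r*m) x (s*n).
r, s, m, n = 2, 3, 3, 2
B, X, H = crandn(r, s), crandn(m, n), crandn(m, n)
G = crandn(r * m, s * n)

# J is quadratic in X, so the central difference matches up to rounding error.
t = 1e-4
fd = (J(G, B, X + t * H) - J(G, B, X - t * H)) / (2 * t)
analytic = dJ(G, B, X, H)
print(fd, analytic)
```

Because $J(X + tH)$ is a quadratic polynomial in $t$, the central difference removes the second-order term exactly, so the two printed numbers should agree to many digits.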

The post you linked explains how this matrix can be found using the SVD. For another approach, we can use the fact that $M_{ij} = J'(X)(E_{ij})$, where $E_{ij}$ denotes the matrix with a $1$ in the $(i,j)$ entry and zeros elsewhere. We therefore have $$ M_{ij} = - 2 \operatorname{Re} \langle G - (B \otimes X),B \otimes E_{ij} \rangle. $$ We can make a bit more sense out of this if we break $G$ into a sum. If $X$ has size $m \times n$, then we can write $$ G = \sum_{p=1}^m \sum_{q=1}^n G_{pq} \otimes E_{pq}, $$ where each $G_{pq}$ has the same size as $B$ (note that $G_{pq}$ is actually a submatrix of $G$: it collects the $(p,q)$ entry of every $m \times n$ block of $G$). With that, and using $\langle A_1 \otimes A_2, B_1 \otimes B_2 \rangle = \langle A_1,B_1 \rangle \langle A_2,B_2 \rangle$ together with $\langle E_{pq},E_{ij} \rangle = \delta_{pi}\delta_{qj}$ and $\langle X,E_{ij} \rangle = x_{ij}$, we have $$\begin{aligned} M_{ij} &= - 2 \operatorname{Re} \left\langle \sum_{p=1}^m \sum_{q=1}^n G_{pq} \otimes E_{pq} - (B \otimes X),B \otimes E_{ij} \right\rangle\\ &= - 2 \operatorname{Re} \left[ \sum_{p=1}^m \sum_{q=1}^n \langle G_{pq},B \rangle \langle E_{pq},E_{ij} \rangle - \langle B,B \rangle \langle X,E_{ij} \rangle \right]\\ &= - 2 \operatorname{Re}[\langle G_{ij},B \rangle - \langle B,B\rangle x_{ij}]. \end{aligned}$$
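The entrywise result can be checked numerically. The sketch below evaluates $M_{ij} = -2\operatorname{Re}[\langle G_{ij},B\rangle - \langle B,B\rangle x_{ij}]$, the single term that survives for a given $(i,j)$, and compares it with finite differences of $J$ along each $E_{ij}$. It assumes NumPy's `np.kron` block layout, under which $G_{ij}$ is the strided slice `G[i::m, j::n]`; the sizes, random data, and function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def crandn(*shape):
    """Random complex array with standard normal real and imaginary parts."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

def J(G, B, X):
    """Objective ||G - (B kron X)||_F^2."""
    return np.linalg.norm(G - np.kron(B, X), "fro") ** 2

def gradient_entries(G, B, X):
    """M_ij = -2 Re[<G_ij, B> - <B, B> x_ij], with <P, Q> = sum P * conj(Q)."""
    m, n = X.shape
    b2 = np.linalg.norm(B, "fro") ** 2          # <B, B>
    M = np.empty((m, n))
    for i in range(m):
        for j in range(n):
            G_ij = G[i::m, j::n]                # (i, j) entry of every m x n block
            M[i, j] = -2.0 * np.real(np.sum(G_ij * np.conj(B)) - b2 * X[i, j])
    return M

# Hypothetical sizes: B is r x s, X is m x n, so G is (r*m) x (s*n).
r, s, m, n = 2, 3, 3, 2
B, X, G = crandn(r, s), crandn(m, n), crandn(r * m, s * n)
M = gradient_entries(G, B, X)

# Central finite differences of J along each real direction E_ij.
t = 1e-4
M_fd = np.empty((m, n))
for i in range(m):
    for j in range(n):
        E = np.zeros((m, n))
        E[i, j] = 1.0
        M_fd[i, j] = (J(G, B, X + t * E) - J(G, B, X - t * E)) / (2 * t)

print(np.max(np.abs(M - M_fd)))
```

Perturbing along the real matrix $E_{ij}$ only probes the real part of $x_{ij}$, which is why $M$ comes out real here even when $X$ is complex.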

0
On

First, recall that the component-wise self-gradient of a matrix is given by $$\eqalign{ \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} \grad{X}{X_{ij}} &= E_{ij} \\ }$$ where $E_{ij}$ is the matrix whose $(i,j)$ element is equal to one and all others are equal to zero.

Then, for typing convenience, define the matrix variable $$\eqalign{ A &= (B\otimes X)-G \\ }$$ and the Frobenius product $$\eqalign{ \def\Si{\sum_{i=1}^m} \def\Sj{\sum_{j=1}^n} P:Q &= \Si\Sj P_{ij}Q_{ij} \;=\; {\rm Tr}(P^TQ) \\ P^*:P &= \left\| P \right\|_F^2 \\ }$$


Use the above notation to write the objective function, then calculate its component-wise Wirtinger gradients (treating $X^*$ as independent of $X$) $$\eqalign{ \def\J{{\cal J}} \J &= \left\| A \right\|_F^2 \\ &= A^*:A \\ d\J &= A^*:dA \\ &= A^*:(B\otimes dX) \\ \grad{\J}{X_{ij}} &= A^*:(B\otimes E_{ij}) \\ }$$ To construct the full matrix-wise gradient, sum these components with the standard matrix basis (which happens to be the previously defined $E_{ij}$ matrices) $$\eqalign{ \grad{\J}{X} &= \Si\Sj\left(\grad{\J}{X_{ij}}\right)E_{ij} \\ }$$ Of course, gradients with respect to the complex conjugates also exist $$\eqalign{ \grad{\J}{X_{ij}^*} &= \left(\grad{\J}{X_{ij}}\right)^* \quad\implies\quad \grad{\J}{X^*} = \left(\grad{\J}{X}\right)^* \\ }$$
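A minimal NumPy sketch of the component formula $A^*:(B\otimes E_{ij})$ follows, checked against the usual finite-difference form of a Wirtinger derivative, $\tfrac12\left(\partial/\partial x - i\,\partial/\partial y\right)$ for $X_{ij} = x + iy$. The sizes, the random data, and the names `crandn`, `J`, and `wirtinger_grad` are assumptions made for illustration; the reshape relies on the block layout produced by `np.kron`.

```python
import numpy as np

rng = np.random.default_rng(2)

def crandn(*shape):
    """Random complex array with standard normal real and imaginary parts."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

def J(G, B, X):
    """Objective ||A||_F^2 with A = (B kron X) - G."""
    return np.linalg.norm(np.kron(B, X) - G, "fro") ** 2

def wirtinger_grad(G, B, X):
    """Matrix with entries conj(A) : (B kron E_ij), where A = (B kron X) - G."""
    r, s = B.shape
    m, n = X.shape
    A = np.kron(B, X) - G
    # View A as blocks: A4[k, i, l, j] == A[k*m + i, l*n + j].
    A4 = A.reshape(r, m, s, n)
    # Sum conj(A4[k, i, l, j]) * B[k, l] over the block indices k, l.
    return np.einsum("kilj,kl->ij", np.conj(A4), B)

# Hypothetical sizes: B is r x s, X is m x n, so G is (r*m) x (s*n).
r, s, m, n = 2, 3, 3, 2
B, X, G = crandn(r, s), crandn(m, n), crandn(r * m, s * n)
grad = wirtinger_grad(G, B, X)      # dJ/dX
grad_conj = np.conj(grad)           # dJ/dX^*

# Finite-difference Wirtinger derivative: (dJ/dx - 1j * dJ/dy) / 2 per entry.
t = 1e-4
grad_fd = np.empty((m, n), dtype=complex)
for i in range(m):
    for j in range(n):
        E = np.zeros((m, n))
        E[i, j] = 1.0
        d_re = (J(G, B, X + t * E) - J(G, B, X - t * E)) / (2 * t)
        d_im = (J(G, B, X + 1j * t * E) - J(G, B, X - 1j * t * E)) / (2 * t)
        grad_fd[i, j] = 0.5 * (d_re - 1j * d_im)

print(np.max(np.abs(grad - grad_fd)))
```

The `einsum` call avoids forming each $B \otimes E_{ij}$ explicitly; a loop over `np.kron(B, E)` would give the same entries at higher cost.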