Vector - Matrix Differentiation that includes the Kronecker product


I have ${y}=({A}\otimes{A}){x}$, where ${A}\in\mathbb{R}^{n\times n}$ and ${x}\in\mathbb{R}^{n^2}$, and I want to find $\frac{d{y}}{d{A}}$ in matrix (or tensor) form. I have looked at other questions on here where the solution uses the Magnus-Neudecker technique of vectorising each side. The issue is that my expression already contains a Kronecker product, so the identity $\operatorname{vec}(ABC)=(C^{\mathrm {T} }\otimes A)\operatorname {vec} (B)$ that those solutions rely on does not seem useful in this case. Any help would be much appreciated.


There are 2 best solutions below

On BEST ANSWER

If you "unvectorize" the vectors $x$ and $y$ into square matrices $X,Y$ (so that $x = \operatorname{vec}(X)$ and $y = \operatorname{vec}(Y)$), the identity $\operatorname{vec}(ABC)=(C^T\otimes A)\operatorname{vec}(B)$ lets you write this as $$ Y = AXA^T. $$ If we want a derivative in some kind of matrix form, we can compute the partial derivative of $Y$ with respect to the $i,j$ entry of $A$. To that end, let $E_{ij}$ denote the matrix with a $1$ in the $i,j$ entry and zeros elsewhere. For $h \in \Bbb R$, we can write $$ \begin{aligned} Y(A + h E_{ij}) &= (A + hE_{ij})X(A + hE_{ij})^T \\ & = AXA^T + h(E_{ij} X A^T + AXE_{ij}^T) + o(h) \\ & = Y(A) + h \frac{\partial Y}{\partial a_{ij}} + o(h). \end{aligned} $$ Matching the terms that are linear in $h$ gives the desired partial derivative, $$ \frac{\partial Y}{\partial a_{ij}} = E_{ij} X A^T + AXE_{ij}^T. $$ In terms of the Kronecker delta, the $p,q$ entry of $\frac{\partial Y}{\partial a_{ij}}$ is given by $$ \left[\frac{\partial Y}{\partial a_{ij}}\right]_{p,q} = \delta_{ip} \left(\sum_{k=1}^n x_{jk}a_{qk} \right) + \delta_{iq}\left(\sum_{k=1}^n a_{pk}x_{kj} \right). $$
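As a sanity check, the formulas in this answer can be compared numerically. The following is a minimal NumPy sketch, not part of the original answer: the helper names (`vec`, `dY_da`, `dY_da_entrywise`, `dY_da_numeric`), the test size `n = 3`, and the step `h` are all illustrative assumptions, and `vec` is assumed to stack columns.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3  # illustrative size (assumption)
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, n))


def vec(M):
    """Stack the columns of M into a single vector (column-major order)."""
    return M.reshape(-1, order="F")


# Check that kron(A, A) @ vec(X) equals vec(A X A^T).
y = np.kron(A, A) @ vec(X)
vec_identity_holds = np.allclose(y, vec(A @ X @ A.T))


def dY_da(A, X, i, j):
    """Closed form of dY/da_ij: E_ij X A^T + A X E_ij^T."""
    E = np.zeros_like(A)
    E[i, j] = 1.0
    return E @ X @ A.T + A @ X @ E.T


def dY_da_entrywise(A, X, i, j):
    """Same derivative, built entry by entry from the Kronecker-delta formula."""
    m = A.shape[0]
    D = np.zeros((m, m))
    for p in range(m):
        for q in range(m):
            if p == i:
                D[p, q] += X[j, :] @ A[q, :]  # sum_k x_jk a_qk
            if q == i:
                D[p, q] += A[p, :] @ X[:, j]  # sum_k a_pk x_kj
    return D


def dY_da_numeric(A, X, i, j, h=1e-6):
    """Central finite difference of Y = A X A^T in the direction E_ij."""
    E = np.zeros_like(A)
    E[i, j] = 1.0
    Y_plus = (A + h * E) @ X @ (A + h * E).T
    Y_minus = (A - h * E) @ X @ (A - h * E).T
    return (Y_plus - Y_minus) / (2 * h)


# Largest discrepancy between closed form and finite difference over all (i, j).
max_err = max(
    np.abs(dY_da(A, X, i, j) - dY_da_numeric(A, X, i, j)).max()
    for i in range(n)
    for j in range(n)
)
print(vec_identity_holds, max_err)
```

Because $Y$ is quadratic in $A$, the central difference cancels the $h^2$ term, so any remaining discrepancy should be floating-point rounding only.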

On

$\def\D{\delta}\def\E{{\cal E}}\def\F{{\cal F}}\def\p#1#2{\frac{\partial #1}{\partial #2}}$The fourth-order tensors $$\eqalign{ \p{X}{X} &= \E \quad&\implies\quad \E_{ijk\ell} = \D_{ik}\D_{j\ell} \\ \p{X^T}{X} &= \F \quad&\implies\quad \F_{ijk\ell} = \D_{i\ell}\D_{jk} \\ }$$ can be used to rearrange matrix products and transpose matrices $$\eqalign{ AXB^T &= \big(A\E B\big):X, \qquad&A:\E&=\E:A&=A \\ ABX^T &= \big(AB\F\big):X, \qquad&A:\F&=\F:A&=A^T \\ }$$ Here a colon denotes the double-contraction product, e.g. $(\E:A)_{ij} = \E_{ijk\ell}A_{k\ell}$, juxtaposition contracts a single pair of adjacent indices, and summation over repeated indices is implied. Write the function in matrix form, as suggested in Ben's answer.
Then calculate the differential and gradient. $$\eqalign{ Y &= AXA^T \\ dY &= dA\,XA^T + AX\,dA^T \\ &= \big(\E AX^T + AX\F\big):dA \\ \p{Y}{A} &= \E AX^T + AX\F \\ }$$ So that's the tensor form. In component form it looks like this $$\eqalign{ \p{Y_{ij}}{A_{k\ell}} &= \E_{ijkm}A_{mn}X_{\ell n} + A_{im}X_{mn}\F_{njk\ell} \\ &= \D_{ik}\D_{jm}A_{mn}X_{\ell n} + A_{im}X_{mn}\D_{n\ell}\D_{jk} \\ &= \D_{ik}A_{jn}X_{\ell n} + A_{im}X_{m\ell}\D_{jk} \\ }$$
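The component formula can be checked numerically by building the fourth-order tensor with `np.einsum`. This is a hedged sketch rather than part of the original answer: the array names (`G`, `J`, `dA`), the size `n = 3`, and the column-stacking `vec` helper are assumptions made for illustration. It also flattens the tensor into an $n^2 \times n^2$ Jacobian of $y$ with respect to $\operatorname{vec}(A)$, which ties the result back to the Kronecker form in the question.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3  # illustrative size (assumption)
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, n))
dA = rng.standard_normal((n, n))  # arbitrary perturbation direction
I = np.eye(n)


def vec(M):
    """Stack the columns of M into a single vector (column-major order)."""
    return M.reshape(-1, order="F")


# G[i,j,k,l] = dY_ij / dA_kl = delta_ik A_jn X_ln + A_im X_ml delta_jk
G = (
    np.einsum("ik,jn,ln->ijkl", I, A, X)
    + np.einsum("im,ml,jk->ijkl", A, X, I)
)

# Double contraction G : dA should reproduce dY = dA X A^T + A X dA^T.
dY_tensor = np.einsum("ijkl,kl->ij", G, dA)
dY_direct = dA @ X @ A.T + A @ X @ dA.T

# Flatten to a Jacobian with row index i + n*j and column index k + n*l,
# matching the column-major vec convention used above.
J = G.reshape((n * n, n * n), order="F")

# Differential of y = kron(A, A) @ x in the direction dA, by bilinearity of kron.
x = vec(X)
dy_kron = (np.kron(dA, A) + np.kron(A, dA)) @ x

print(np.allclose(dY_tensor, dY_direct), np.allclose(J @ vec(dA), dy_kron))
```

Flattening with `order="F"` is a design choice that keeps the Jacobian consistent with column-stacked vectors. A row-major `vec` would need a matching row-major reshape.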