I am trying to compute the derivative
$\frac{\partial}{\partial W} \text{Tr}(W^\top A (I\otimes W)B),$
where $W\in\mathbb{R}^{D\times d}$, $I\in\mathbb{R}^{T\times T}$ is the identity matrix, $A\in\mathbb{R}^{D\times DT}$, and $B\in\mathbb{R}^{dT\times d}$.
I have found a similar post: Derivative involving the trace of a Kronecker product
but it seems that the method is not applicable to my problem.
Thank you!
The technique from the linked post can be applied to the current problem.
Write the function in terms of the trace/Frobenius product, $X:Y={\rm tr}(X^TY)$, and find its differential $$\eqalign{ \phi &= W:A(I\otimes W)B = A^TWB^T:(I\otimes W) \cr d\phi &= A(I\otimes W)B:dW + A^TWB^T:(I\otimes dW) }$$ At this point, we need to use the Pitsianis (Kronecker product) decomposition on that last term. $$\eqalign{ A^TWB^T &= \sum_k Y_k\otimes Z_k \cr }$$ The matrices $(Y_k,Z_k)$ are shaped like $(I,W)$ respectively, i.e. $Y_k\in\mathbb{R}^{T\times T}$ and $Z_k\in\mathbb{R}^{D\times d}$.
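For readers who want to see the decomposition concretely, here is a minimal NumPy sketch. It assumes the usual construction attributed to Van Loan and Pitsianis: rearrange the $D\times d$ blocks of the matrix into rows, then take an SVD. The helper name `kron_decompose` and the random test sizes are my own choices for illustration, not part of the original problem.

```python
import numpy as np

def kron_decompose(M, T, D, d):
    """Split M (shape T*D x T*d) into sum_k kron(Y_k, Z_k),
    with Y_k of shape T x T and Z_k of shape D x d."""
    # Rearrange so that row (i, j) holds the flattened D x d block M_ij.
    R = M.reshape(T, D, T, d).transpose(0, 2, 1, 3).reshape(T * T, D * d)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    # Each singular triplet gives one Kronecker term.
    Ys = [(s[k] * U[:, k]).reshape(T, T) for k in range(len(s))]
    Zs = [Vt[k].reshape(D, d) for k in range(len(s))]
    return Ys, Zs

# Hypothetical small sizes and random data, just to exercise the helper.
rng = np.random.default_rng(0)
T, D, d = 3, 4, 2
A = rng.standard_normal((D, D * T))
B = rng.standard_normal((d * T, d))
W = rng.standard_normal((D, d))

M = A.T @ W @ B.T                      # shape (T*D, T*d)
Ys, Zs = kron_decompose(M, T, D, d)
M_rebuilt = sum(np.kron(Y, Z) for Y, Z in zip(Ys, Zs))
recon_err = np.abs(M - M_rebuilt).max()
print("max reconstruction error:", recon_err)
```

The reconstruction error should be at the level of floating-point rounding, which confirms that the terms have the shapes of $(I,W)$.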
Finish calculating the differential using the identity $(Y\otimes Z):(I\otimes dW) = (Y:I)(Z:dW)$, where $I:Y_k={\rm tr}(Y_k)$, then read off the gradient. $$\eqalign{ d\phi &= A(I\otimes W)B:dW + \sum_k(Y_k\otimes Z_k):(I\otimes dW) \cr &= \Big(A(I\otimes W)B + \sum_k(I:Y_k)Z_k\Big):dW \cr \frac{\partial\phi}{\partial W} &= A(I\otimes W)B + \sum_k {\rm tr}(Y_k)\,Z_k \cr }$$
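A quick numerical sanity check of this gradient can be sketched as follows. One observation that is not in the derivation above: the $(i,j)$ block of $\sum_k Y_k\otimes Z_k$ is $\sum_k (Y_k)_{ij}Z_k$, so $\sum_k{\rm tr}(Y_k)Z_k$ equals the sum of the $T$ diagonal $D\times d$ blocks of $A^TWB^T$, and the code uses that shortcut instead of an explicit decomposition. The function names and test sizes are hypothetical.

```python
import numpy as np

def phi(W, A, B, T):
    # phi = Tr(W^T A (I kron W) B)
    return np.trace(W.T @ A @ np.kron(np.eye(T), W) @ B)

def grad_phi(W, A, B, T):
    D, d = W.shape
    M = A.T @ W @ B.T                       # shape (T*D, T*d)
    blocks = M.reshape(T, D, T, d)          # blocks[i, :, j, :] is block M_ij
    diag_sum = np.einsum('ipiq->pq', blocks)  # sum of the diagonal blocks M_ii
    return A @ np.kron(np.eye(T), W) @ B + diag_sum

def numeric_grad(W, A, B, T, eps=1e-6):
    # Central finite differences, one entry of W at a time.
    G = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        E = np.zeros_like(W)
        E[idx] = eps
        G[idx] = (phi(W + E, A, B, T) - phi(W - E, A, B, T)) / (2 * eps)
    return G

# Hypothetical small sizes and random data.
rng = np.random.default_rng(1)
T, D, d = 3, 4, 2
A = rng.standard_normal((D, D * T))
B = rng.standard_normal((d * T, d))
W = rng.standard_normal((D, d))

max_err = np.abs(grad_phi(W, A, B, T) - numeric_grad(W, A, B, T)).max()
print("max abs difference:", max_err)
```

Since $\phi$ is quadratic in $W$, the central difference has no truncation error, so any remaining discrepancy should be due to rounding only.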
Another technique uses the SVD of $$B=\sum_k\sigma_ku_kv_k^T$$ to handle the second term of $d\phi$. Moving $B^T$ to the other side of the Frobenius product and using $(I\otimes dW)\,{\rm vec}(U) = {\rm vec}(dW\,U)$ (with column-stacking ${\rm vec}$) gives $$\eqalign{ A^TWB^T:(I\otimes dW) &= A^TW:(I\otimes dW)B \cr &= \sum_k\,A^TW:(I\otimes dW)\sigma_ku_kv_k^T \cr &= \sum_k\,(A^TW\sigma_kv_k):(I\otimes dW)u_k \cr &= \sum_k\,q_k:{\rm vec}(dW\,U_k) \cr &= \sum_k\,Q_k:dW\,U_k \cr &= \sum_k\,Q_kU_k^T:dW \cr }$$ where $Q_k\in\mathbb{R}^{D\times T}$ and $U_k\in\mathbb{R}^{d\times T}$ are defined by $$\eqalign{ {\rm vec}(Q_k) &= q_k = A^TW\sigma_kv_k \cr {\rm vec}(U_k) &= u_k \cr }$$ This yields the gradient $$\eqalign{ \frac{\partial\phi}{\partial W} &= A(I\otimes W)B + \sum_k Q_kU_k^T \cr }$$
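The SVD-based formula can be checked numerically in the same spirit. The sketch below assumes column-major (Fortran-order) reshaping to undo ${\rm vec}$, which is what the identity $(I\otimes dW)\,{\rm vec}(U)={\rm vec}(dW\,U)$ requires; the function names and test sizes are my own.

```python
import numpy as np

def phi(W, A, B, T):
    # phi = Tr(W^T A (I kron W) B)
    return np.trace(W.T @ A @ np.kron(np.eye(T), W) @ B)

def grad_phi_svd(W, A, B, T):
    D, d = W.shape
    Uu, s, Vt = np.linalg.svd(B, full_matrices=False)  # B = sum_k s_k u_k v_k^T
    G = A @ np.kron(np.eye(T), W) @ B
    for k in range(len(s)):
        q_k = A.T @ W @ (s[k] * Vt[k])            # length D*T
        Q_k = q_k.reshape(D, T, order='F')        # vec(Q_k) = q_k (column-major)
        U_k = Uu[:, k].reshape(d, T, order='F')   # vec(U_k) = u_k (column-major)
        G += Q_k @ U_k.T
    return G

def numeric_grad(W, A, B, T, eps=1e-6):
    # Central finite differences, one entry of W at a time.
    G = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        E = np.zeros_like(W)
        E[idx] = eps
        G[idx] = (phi(W + E, A, B, T) - phi(W - E, A, B, T)) / (2 * eps)
    return G

# Hypothetical small sizes and random data.
rng = np.random.default_rng(2)
T, D, d = 3, 4, 2
A = rng.standard_normal((D, D * T))
B = rng.standard_normal((d * T, d))
W = rng.standard_normal((D, d))

max_err = np.abs(grad_phi_svd(W, A, B, T) - numeric_grad(W, A, B, T)).max()
print("max abs difference:", max_err)
```

Note that $B$ has at most $d$ nonzero singular values, so the sum has at most $d$ terms regardless of $T$.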