How do I calculate this matrix derivative?
$\frac{\partial W^TWW^TW}{\partial W}$
$W$ is a matrix.
I would appreciate any help on this.
First define two new matrices and their differentials.
$$\eqalign{
Y &= W^TW &\implies &dY = W^TdW + dW^TW \cr
F &= Y^TY = Y^2 &\implies &dF = Y\,dY + dY\,Y \cr
}$$
Here $Y^TY = Y^2$ because $Y$ is symmetric.

Then expand $dF$, vectorize it, and calculate the gradient. Write $f={\rm vec}(F)$ and $w={\rm vec}(W)$, and use the identity ${\rm vec}(AXB) = (B^T\otimes A)\,{\rm vec}(X)$ together with ${\rm vec}(dW^T) = K\,dw$.
$$\eqalign{
dF &= YW^TdW + YdW^TW + dW^TW\,Y + W^TdW\,Y \cr
{\rm vec}\big(dF\big) &= {\rm vec}\Big(YW^TdW + YdW^TW + dW^TW\,Y + W^TdW\,Y\Big) \cr
df &= \Big(I\otimes YW^T+Y\otimes W^T\Big)\,dw + \Big(W^T\otimes Y+YW^T\otimes I\Big)K\,dw \cr
\frac{\partial f}{\partial w} &= \Big(I\otimes YW^T+Y\otimes W^T\Big) + \Big(W^T\otimes Y+YW^T\otimes I\Big)K \cr
&= \Big(I\otimes W^TWW^T+W^TW\otimes W^T\Big) + \Big(W^T\otimes W^TW+W^TWW^T\otimes I\Big)K \cr
}$$
where $K$ is the Commutation Matrix associated with the vec-operation, and $I$ is the identity matrix with the same dimensions as $Y$.
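A closed-form result like this is easy to check numerically. The following is a minimal sketch, assuming NumPy, a column-major (column-stacking) vec operation, and an $m\times n$ matrix $W$. The helper names (`commutation_matrix`, `jacobian_formula`, `jacobian_numeric`) are hypothetical and chosen only for illustration. It builds the Kronecker-product expression for $\partial\,{\rm vec}(W^TWW^TW)/\partial\,{\rm vec}(W)$ and compares it against central finite differences.

```python
import numpy as np

def vec(A):
    """Stack the columns of A into one vector (column-major vec)."""
    return A.reshape(-1, order="F")

def commutation_matrix(m, n):
    """Return K such that K @ vec(A) == vec(A.T) for any m-by-n matrix A."""
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            # A[i, j] is entry i + j*m of vec(A) and entry j + i*n of vec(A.T).
            K[j + i * n, i + j * m] = 1.0
    return K

def F(W):
    """F(W) = W^T W W^T W."""
    Y = W.T @ W
    return Y @ Y

def jacobian_formula(W):
    """Jacobian d vec(F) / d vec(W) from the Kronecker-product expression."""
    m, n = W.shape
    Y = W.T @ W          # n-by-n, symmetric
    YWt = Y @ W.T        # n-by-m, equals W^T W W^T
    I = np.eye(n)
    K = commutation_matrix(m, n)
    direct = np.kron(I, YWt) + np.kron(Y, W.T)      # terms containing dW
    transposed = np.kron(W.T, Y) + np.kron(YWt, I)  # terms containing dW^T
    return direct + transposed @ K

def jacobian_numeric(W, eps=1e-6):
    """Central finite-difference approximation of the same Jacobian."""
    m, n = W.shape
    J = np.zeros((n * n, m * n))
    for k in range(m * n):
        e = np.zeros(m * n)
        e[k] = eps
        dW = e.reshape((m, n), order="F")  # perturb one entry of vec(W)
        J[:, k] = vec(F(W + dW) - F(W - dW)) / (2 * eps)
    return J

# Assumed test sizes; any m, n should work.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))
max_err = np.max(np.abs(jacobian_formula(W) - jacobian_numeric(W)))
print("max abs difference:", max_err)
```

The Jacobian has shape $n^2\times mn$, one row per entry of $F$ and one column per entry of $W$. If a row-major vec is used instead, the Kronecker factors and the commutation matrix change accordingly, so this check applies only to the column-major convention assumed here.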