I am trying to understand the answer to this question. How do you get the following?
$$\nabla_{\mathrm W}\left(\mbox{tr} \left( \mathrm W^{\top} \mathrm X^{\top} \mathrm X \mathrm W - \mathrm Y^{\top} \mathrm X \mathrm W - \mathrm W^{\top} \mathrm X^{\top} \mathrm Y + \mathrm Y^{\top} \mathrm Y \right)\right)$$ $$= 2 \, \mathrm X^{\top} \mathrm X \mathrm W - 2 \, \mathrm X^{\top} \mathrm Y$$
Specifically, I want to know what kind of magic happens to these:
$$-\mathrm Y^{\top} \mathrm X \mathrm W - \mathrm W^{\top} \mathrm X^{\top} \mathrm Y$$
Thank you so much.
The trace of a matrix is equal to the trace of its transpose, and $(Y^TXW)^T = W^TX^TY$. So
$$\operatorname{tr}(Y^T XW+ W^TX^TY)= 2\operatorname{tr}(Y^TXW)$$ and
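$$W \mapsto 2\operatorname{tr}(Y^TXW)$$ is linear, so its derivative at every point is the map itself: the directional derivative in the direction $H$ is $2\operatorname{tr}(Y^TXH) = \langle 2X^TY, H\rangle$, where $\langle A, B\rangle = \operatorname{tr}(A^TB)$ is the Frobenius inner product. The gradient of this map is therefore $2X^TY$, and with the minus sign carried by the two terms in your expression they contribute $-2X^TY$. See "derivative in product in trace" if required.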
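If it helps to see the result numerically, here is a minimal sketch that compares a central finite-difference gradient with the closed forms. The matrix sizes, the random data, and the helper names (`full_objective`, `cross_terms`, `numerical_gradient`) are assumptions made only for illustration; they are not part of the original question.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 6, 4, 3  # hypothetical sizes, chosen only for this check
X = rng.standard_normal((n, d))
Y = rng.standard_normal((n, k))
W = rng.standard_normal((d, k))


def full_objective(W):
    # tr(W^T X^T X W - Y^T X W - W^T X^T Y + Y^T Y) = tr((XW - Y)^T (XW - Y))
    R = X @ W - Y
    return np.trace(R.T @ R)


def cross_terms(W):
    # Only the two terms asked about: -tr(Y^T X W) - tr(W^T X^T Y)
    return -np.trace(Y.T @ X @ W) - np.trace(W.T @ X.T @ Y)


def numerical_gradient(func, W, eps=1e-6):
    # Central differences, one entry of W at a time.
    G = np.zeros_like(W)
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            E = np.zeros_like(W)
            E[i, j] = eps
            G[i, j] = (func(W + E) - func(W - E)) / (2 * eps)
    return G


grad_cross = numerical_gradient(cross_terms, W)
grad_full = numerical_gradient(full_objective, W)

# Compare against the closed-form gradients from the answer.
cross_ok = np.allclose(grad_cross, -2 * X.T @ Y, atol=1e-5)
full_ok = np.allclose(grad_full, 2 * X.T @ X @ W - 2 * X.T @ Y, atol=1e-5)
print(cross_ok, full_ok)
```

The gradient of the cross terms does not depend on $W$, which is what you would expect from a linear map.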