Suppose $\mathbf{X}$ is a p x n matrix, $\mathbf{Y}$ is q x n, $\mathbf{C}$ is an unknown q x p matrix. Can you minimize the following with gradient descent to find C? (multivariate regression)
$|| \mathbf{Y}-\mathbf{C}\mathbf{X}||^2_F$
Wouldn't the gradient wrt $\mathbf{C}$ just be
$\nabla g = -2 (\mathbf{Y}-\mathbf{C}\mathbf{X}) \mathbf{X}'$
Your function can be written in the form $$ g(C) = \operatorname{tr}[(Y - CX)(Y - CX)'] = \operatorname{tr}[CXX'C] - 2\operatorname{tr}[CXY'] + [\text{const}] $$ Following this table, the gradient (either the transpose of the derivative in the left side column, or equivalently the derivative in the right-most column) will be $$ \frac{\partial g}{\partial C} = [(Y - CX)(-X') + (-X)(Y - CX)']' - [2XY']' \\ = -X(Y - CX)' - (Y - CX)X' - 2YX'. $$
For a more abstract approach, note that
$$ g(C+H) = \operatorname{tr}[(Y - [C+H]X)(Y - [C+H]X)']\\ = g(C) - \operatorname{tr}[HX(Y - CX)'] - \operatorname{tr}[(Y - CX)X'] + o(H)\\ = g(C) - 2\operatorname{tr}[HX(Y - CX)'] + o(H). $$ We therefore find that $dg = -\operatorname{tr}[2X(Y - CX)' dC]$, so that the gradient is equal to $$ -[X(Y - CX)']' = -2(Y - CX)X'. $$ This matches your result.