I am struggling with one step in the derivation of the normal equation.
I don't understand how differentiating $\operatorname{tr}(\theta^TX^TX\theta)$ with respect to $\theta$ gives $2X^TX\theta$.
I know that the derivative of $\operatorname{tr}(ABA^TC)$ with respect to $A$ is $CAB + C^TAB^T$, and the lecturer seems to want me to use this identity, but I don't see how to apply it.
The picture shows the part of the lecture notes that I'm struggling with.

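\begin{align}
& \frac d {d\theta^T} \operatorname{tr}(\theta^TX^TX\theta) \\[12pt]
= {} & \frac d {dA}\operatorname{tr}(ABA^TC) \\[4pt]
& \text{with $\theta^T$ in the role of } A, \\
& \text{$X^TX$ in the role of } B, \\
& \text{and } I \text{ in the role of } C \\[12pt]
= {} & CAB + C^TAB^T \quad \text{(This was given.)} \\[10pt]
= {} & \theta^T X^TX + \theta^T (X^TX)^T \\[10pt]
= {} & 2\theta^T X^TX \quad \text{(since $X^TX$ is symmetric).}
\end{align}
Because $\theta^T$ plays the role of $A$, this is the derivative with respect to the row vector $\theta^T$. Transposing it gives the derivative with respect to the column vector $\theta$, which is $2X^TX\theta$.

Here, $B$ and $C$ must be square matrices, and their sizes differ if $A$ is not a square matrix. In this case $A = \theta^T$ is $1 \times n$, so $B = X^TX$ is $n \times n$ and $C = I$ is $1 \times 1$.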
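If it helps to convince yourself of the result $2X^TX\theta$, here is a rough numerical sanity check in plain Python. It is only a sketch: the matrix `X`, the vector `theta`, and all helper names (`matvec`, `transpose`, `f`, `analytic_gradient`, `numeric_gradient`) are made up for illustration and do not come from the lecture notes. It compares the claimed gradient against a central-difference approximation of $\theta^TX^TX\theta$ (the trace of a $1 \times 1$ matrix is just that scalar).

```python
# Numerical sanity check in pure Python (no third-party packages).
# X and theta are small made-up values chosen only for illustration.

def matvec(M, v):
    """Return the matrix-vector product M v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def transpose(M):
    """Return the transpose of M."""
    return [list(col) for col in zip(*M)]


def f(X, theta):
    """theta^T X^T X theta, which equals the squared norm of X theta."""
    return sum(r * r for r in matvec(X, theta))


def analytic_gradient(X, theta):
    """The claimed gradient 2 X^T X theta."""
    return [2.0 * g for g in matvec(transpose(X), matvec(X, theta))]


def numeric_gradient(X, theta, h=1e-6):
    """Central-difference approximation of the gradient of f."""
    grad = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += h
        minus[i] -= h
        grad.append((f(X, plus) - f(X, minus)) / (2.0 * h))
    return grad


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # hypothetical 3x2 design matrix
theta = [0.5, -1.0]                        # hypothetical parameter vector

analytic = analytic_gradient(X, theta)
numeric = numeric_gradient(X, theta)
max_error = max(abs(a - n) for a, n in zip(analytic, numeric))

print("analytic:", analytic)
print("numeric: ", numeric)
print("max abs difference:", max_error)
```

The two gradients should agree up to floating-point rounding. A check like this does not prove the identity, but it will catch a missing factor of 2 or a misplaced transpose.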