Switching order of matrices in matrix calculus equations


While deriving the closed-form solution for linear regression, my professor stated:

$\frac{\partial \theta^\intercal X^\intercal X\theta}{\partial \theta} = 2X^\intercal X \theta$
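For context, here is a sketch of how I think this result comes out when written in components. I am assuming $X$ is $n \times d$ and $\theta$ is a $d \times 1$ column vector (these shapes and the shorthand $A = X^\intercal X$ are my own assumptions, not something stated in the lecture). Note that $A$ is symmetric, since $A^\intercal = (X^\intercal X)^\intercal = X^\intercal X = A$.

$$
\theta^\intercal A \theta = \sum_{i=1}^{d}\sum_{j=1}^{d} \theta_i A_{ij} \theta_j
$$

$$
\frac{\partial}{\partial \theta_k}\left(\theta^\intercal A \theta\right)
= \sum_{j=1}^{d} A_{kj}\theta_j + \sum_{i=1}^{d} \theta_i A_{ik}
= (A\theta)_k + (A^\intercal\theta)_k
= 2\,(A\theta)_k
$$

The last step uses the symmetry of $A$. Stacking the $d$ components gives $2A\theta = 2X^\intercal X\theta$. Inside the sums every factor is a scalar, so they can be moved freely, but once the sums are collapsed back into matrix notation the order appears to be fixed by which index is summed against which.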

What are the rules for deciding to write $\theta^\intercal X^\intercal X\theta$

and not $\theta^2 X^\intercal X$

or to write $2 X^\intercal X \theta$

and not $2 \theta X^\intercal X$

If the reason is to make the dimensions match for matrix multiplication, can I reorder the terms of a product in any way that yields the dimensions I want, or are there rules that restrict this?
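To make the question concrete, here is a small numerical sketch in plain Python. The shapes ($X$ is $3 \times 2$, $\theta$ is $2 \times 1$), the sample values, and the helper names `matmul`, `transpose`, and `f` are all hypothetical choices of mine for illustration. It compares $2X^\intercal X\theta$ against a finite-difference gradient and tries the swapped order $\theta X^\intercal X$.

```python
def matmul(A, B):
    """Multiply matrices stored as lists of rows; raise if the shapes do not conform."""
    n, k = len(A), len(A[0])
    k2, m = len(B), len(B[0])
    if k != k2:
        raise ValueError(f"cannot multiply {n}x{k} by {k2}x{m}")
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


# Assumed example data: X is 3x2, theta is a 2x1 column vector.
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
theta = [[1.0], [-1.0]]

XtX = matmul(transpose(X), X)  # 2x2, symmetric

# Candidate gradient from the formula: 2 X^T X theta (2x1).
grad = [[2 * v for v in row] for row in matmul(XtX, theta)]


def f(th):
    """Scalar objective theta^T X^T X theta, computed as the squared norm of X theta."""
    return sum(row[0] ** 2 for row in matmul(X, th))


# Central finite differences, one coordinate of theta at a time.
h = 1e-6
num_grad = []
for k in range(len(theta)):
    plus = [[v[0] + (h if i == k else 0.0)] for i, v in enumerate(theta)]
    minus = [[v[0] - (h if i == k else 0.0)] for i, v in enumerate(theta)]
    num_grad.append((f(plus) - f(minus)) / (2 * h))

# The swapped order theta (2x1) times X^T X (2x2) does not conform.
try:
    matmul(theta, XtX)
    swapped_ok = True
except ValueError:
    swapped_ok = False

print(grad, num_grad, swapped_ok)
```

With these values the formula and the finite-difference estimate agree, while the swapped product is rejected because a $2 \times 1$ matrix cannot be multiplied by a $2 \times 2$ one. This only illustrates one case; it does not by itself tell me what the general rule is.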