Eq. 101 of The Matrix Cookbook claims that
$$\frac{\partial \operatorname {Tr}(AXB)}{\partial X} = A^\top B^\top$$
What is unclear to me is why the dimensions in the product $A^\top B^\top$ fit. For example, if $A$ is a $2 \times 3$ matrix, $X$ is a $3 \times 5$ matrix, and $B$ is a $5 \times 6$ matrix, then $A^\top$ is $3 \times 2$ and $B^\top$ is $6 \times 5$, so their product does not seem to be defined. Am I reading the formula incorrectly? Is it true only for square matrices? If so, what extensions exist for general matrices?
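
To make the mismatch concrete, here is a minimal sketch in plain Python that tracks only the shapes from the example, not actual matrix entries. The helper names `transpose_shape` and `matmul_shape` are hypothetical, introduced only for this illustration, and are not taken from any library.

```python
def transpose_shape(shape):
    """Shape of the transpose of a matrix with the given (rows, cols) shape."""
    rows, cols = shape
    return (cols, rows)


def matmul_shape(left, right):
    """Shape of left @ right, or None if the inner dimensions differ."""
    if left is None or right is None or left[1] != right[0]:
        return None
    return (left[0], right[1])


# Shapes from the example: A is 2x3, X is 3x5, B is 5x6.
A, X, B = (2, 3), (3, 5), (5, 6)

# Shape of the product AXB that appears inside the trace.
AXB = matmul_shape(matmul_shape(A, X), B)

# Shape of the claimed derivative A^T B^T (None means the product is undefined).
At_Bt = matmul_shape(transpose_shape(A), transpose_shape(B))

print("AXB shape:", AXB)
print("A^T B^T shape:", At_Bt)
```

With these shapes the script prints `(2, 6)` for $AXB$ and `None` for $A^\top B^\top$, which is exactly the incompatibility described above.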