Say $y = W^T b$, where $W^T \in R^{2 \times 2}$ and $b \in R^{2 \times 1}$. We want to calculate $\frac{\partial y}{\partial W}$.
First, looking at the dimensions: $y \in R^{2 \times 1}$, so $\frac{\partial y}{\partial W} \in R^{2 \times 2 \times 2}$. Does that mean it will be a third-order tensor?
Second, is the following correct: $\frac{\partial y}{\partial W} = b^T \otimes I_{2 \times 2}$?
Are there any good resources to read up on matrix derivatives, especially for understanding the dimensions involved?
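According to my calculations (writing $W_{ij}$ for the entries of the matrix that multiplies $b$, so that $y_i = \sum_j W_{ij} b_j$; in the question's notation this matrix is $W^T$),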
$$\begin{matrix}\frac{dy_1}{dW_{11}}=b_1&\frac{dy_1}{dW_{12}}=b_2&\frac{dy_1}{dW_{21}}=0&\frac{dy_1}{dW_{22}}=0\\ \frac{dy_2}{dW_{11}}=0&\frac{dy_2}{dW_{12}}=0&\frac{dy_2}{dW_{21}}=b_1&\frac{dy_2}{dW_{22}}=b_2\end{matrix}$$
So it looks like $b^T\otimes I_{2\times 2}$ would give you
$$\begin{aligned}\begin{pmatrix}b_1&b_2\end{pmatrix}\otimes\begin{pmatrix}1&0\\0&1\end{pmatrix} &=\begin{pmatrix}b_1\begin{pmatrix}1&0\\0&1\end{pmatrix}&b_2\begin{pmatrix}1&0\\0&1\end{pmatrix}\end{pmatrix}\\ &=\begin{pmatrix}b_1&0&b_2&0\\0&b_1&0&b_2\end{pmatrix}\end{aligned}$$
which matches
$$\begin{pmatrix}\frac{dy_1}{dW_{11}}&\frac{dy_1}{dW_{21}}&\frac{dy_1}{dW_{12}}&\frac{dy_1}{dW_{22}}\\ \frac{dy_2}{dW_{11}}&\frac{dy_2}{dW_{21}}&\frac{dy_2}{dW_{12}}&\frac{dy_2}{dW_{22}}\end{pmatrix}$$
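if you order the columns as $W_{11}, W_{21}, W_{12}, W_{22}$, that is, if you differentiate with respect to $\operatorname{vec}(W)$, the vector obtained by stacking the columns of $W$.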
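If you want to sanity-check an identity like this numerically, here is a minimal NumPy sketch. It is only an illustration under my own assumptions: the helper `jacobian_wrt_vec`, the name `M` for the matrix being differentiated, and the sample values are all made up for the example. It builds the $2 \times 4$ Jacobian with respect to the column-stacked $\operatorname{vec}(M)$ by finite differences and compares it with a Kronecker product. It also checks the transposed case $y = M^T b$, where the same column ordering gives $I \otimes b^T$ instead of $b^T \otimes I$.

```python
import numpy as np

def jacobian_wrt_vec(f, M, eps=1e-6):
    """Central-difference Jacobian of f(M) w.r.t. vec(M), columns of M stacked.

    Hypothetical helper written for this example, not a library function.
    """
    m = M.flatten(order="F")            # column-major: M11, M21, M12, M22
    J = np.zeros((f(M).size, m.size))
    for k in range(m.size):
        step = np.zeros_like(m)
        step[k] = eps
        plus = f((m + step).reshape(M.shape, order="F"))
        minus = f((m - step).reshape(M.shape, order="F"))
        J[:, k] = (plus - minus) / (2 * eps)
    return J

# Arbitrary sample values, chosen only for illustration.
b = np.array([3.0, 5.0])
M = np.array([[1.0, 2.0], [4.0, 7.0]])

J_Mb = jacobian_wrt_vec(lambda A: A @ b, M)      # y = M b
J_MTb = jacobian_wrt_vec(lambda A: A.T @ b, M)   # y = M^T b

kron_bT_I = np.kron(b.reshape(1, 2), np.eye(2))  # b^T (x) I
kron_I_bT = np.kron(np.eye(2), b.reshape(1, 2))  # I (x) b^T

print(np.allclose(J_Mb, kron_bT_I))
print(np.allclose(J_MTb, kron_I_bT))

# The same numbers viewed as a third-order tensor: T[i, j, k] = dy_i / dM_jk.
T = J_Mb.reshape(2, 2, 2, order="F")
```

The reshape on the last line is the link between the two viewpoints in the question: the $2 \times 2 \times 2$ tensor and the $2 \times 4$ matrix hold the same eight partial derivatives, and the Kronecker form is just one particular way of flattening the last two indices.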