I'm new to matrix calculus and have a question about the following setup:
$J=J(\mathbf{z})$
$\mathbf{z}=\mathbf{W}\mathbf{a}$
where $J: R^m \rightarrow R$, $\mathbf{z}$ is an $m\times1$ vector, $\mathbf{a}$ is an $n\times1$ vector and $\mathbf{W}$ is an $m\times n$ matrix.
I want to calculate $\frac{\partial{J}}{\partial{\mathbf{W}}}$. A reference paper says I need to turn $\mathbf{W}$ into a vector by stacking its columns:
$\frac{\partial{J}}{\partial{vec(\mathbf{W})}} = \frac{\partial J}{\partial \mathbf{z}}\cdot\frac{\partial \mathbf{z}}{\partial vec(\mathbf{W})}$
Let $\delta^T = \frac{\partial J}{\partial \mathbf{z}}$, which is a $1\times m$ row vector (numerator layout).
$\frac{\partial \mathbf{z}}{\partial vec(\mathbf{W})} = \frac{\partial \mathbf{W}\mathbf{a}}{\partial vec(\mathbf{W})}=\frac{\partial vec(\mathbf{W}\mathbf{a})}{\partial vec(\mathbf{W})} = \frac{\partial (\mathbf{a}^T \otimes I_{mm})vec(\mathbf{W})}{\partial vec(\mathbf{W})}=\mathbf{a}^T \otimes I_{mm}$, where $\otimes$ is the Kronecker product and $I_{mm}$ is the $m\times m$ identity matrix.
$\mathbf{a}^T \otimes I_{mm}$ is $m\times mn$ matrix.
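The identity $vec(\mathbf{W}\mathbf{a}) = (\mathbf{a}^T \otimes I_{mm})\,vec(\mathbf{W})$ used above can be checked numerically. The following is only a small sketch with NumPy; the sizes `m = 3`, `n = 4` and the random test data are arbitrary choices for illustration, not part of the question.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 4                      # arbitrary illustrative sizes
W = rng.standard_normal((m, n))
a = rng.standard_normal(n)

# vec(W): stack the columns of W (column-major, i.e. Fortran order).
vec_W = W.reshape(-1, order="F")

# a^T (x) I_m, built from the 1 x n row vector a^T and the m x m identity.
K = np.kron(a.reshape(1, n), np.eye(m))

z = W @ a
identity_holds = np.allclose(K @ vec_W, z)
print(K.shape, identity_holds)
```

Note that `order="F"` matters here: NumPy's default reshape is row-major, which would correspond to stacking rows rather than columns.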
$\delta^T\cdot(\mathbf{a}^T \otimes I_{mm}) = [\delta_1\cdot \mathbf{a}^T, \delta_2\cdot \mathbf{a}^T, ..., \delta_m\cdot \mathbf{a}^T]$. If I turn this back into a matrix by undoing the column stacking, the result looks strange: $$ \begin{matrix} \delta_1a_1 & \delta_{1}a_{m+1} & \dots \\ \delta_1a_2 & \delta_{1}a_{m+2} & \dots \\ \vdots & \vdots & \dots \\ \delta_1a_m & \delta_{2}a_{2m-n} & \dots \end{matrix} $$
This is obviously wrong. Another reference I found says the result should be $\mathbf{\delta}\cdot \mathbf{a}^T$. So I think the inverse of $vec(\cdot)$ should use row stacking instead. Is that right?
Update: I found my mistake. It came to me while I was taking a shower before going to bed last night.
The expression $\delta^T\cdot(\mathbf{a}^T \otimes I_{mm})$ is right, but the expansion $\delta^T\cdot(\mathbf{a}^T \otimes I_{mm}) = [\delta_1\cdot \mathbf{a}^T, \delta_2\cdot \mathbf{a}^T, ..., \delta_m\cdot \mathbf{a}^T]$ is wrong.
It should be:
$\delta^T\cdot(\mathbf{a}^T \otimes I_{mm}) = [a_1\cdot \mathbf{\delta}^T, a_2\cdot \mathbf{\delta}^T, ..., a_n\cdot \mathbf{\delta}^T]$
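Un-stacking this row vector (numerator layout) back into an $m\times n$ matrix column by column, the block $a_j\cdot\mathbf{\delta}^T$ becomes column $j$, namely $a_j\cdot\mathbf{\delta}$, so the final result is $\delta\cdot\mathbf{a}^T$.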
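As a sanity check, the row vector can be formed and un-stacked numerically. This is a rough sketch with NumPy under assumed sizes (`m = 3`, `n = 4`) and random values for $\delta$ and $\mathbf{a}$; it only confirms the algebra and is not tied to any particular $J$.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 4                      # arbitrary illustrative sizes
a = rng.standard_normal(n)
delta = rng.standard_normal(m)   # stands in for dJ/dz

# delta^T (a^T (x) I_m): a row vector of length m*n.
K = np.kron(a.reshape(1, n), np.eye(m))
row = delta @ K

# The row vector consists of n blocks, block j being a_j * delta^T.
blocks = np.concatenate([a_j * delta for a_j in a])
blocks_match = np.allclose(row, blocks)

# Undo the column stacking: fill an m x n matrix column by column.
grad = row.reshape(m, n, order="F")
matches_outer = np.allclose(grad, np.outer(delta, a))
print(blocks_match, matches_outer)
```

Here `np.outer(delta, a)` is $\delta\cdot\mathbf{a}^T$, so column-wise un-stacking is consistent with the column-wise $vec(\cdot)$ used in the derivation, and no switch to row stacking is needed.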
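I had misunderstood the definition of the Kronecker product: each entry of the left factor multiplies a whole copy of the right factor, so $\mathbf{a}^T \otimes I_{mm} = [a_1 I_{mm}, a_2 I_{mm}, ..., a_n I_{mm}]$.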