Let $X = \begin{bmatrix} x_{11} & x_{12} & x_{13} & x_{14} \\ x_{21} & x_{22} & x_{23} & x_{24} \end{bmatrix}$
and $Y = \begin{bmatrix} y_{11} \\ y_{21} \\ y_{31} \\ y_{41} \end{bmatrix}$
Assume $Z = XY = \begin{bmatrix} z_{11} \\ z_{21} \end{bmatrix} = \begin{bmatrix} x_{11}y_{11} + x_{12}y_{21} + x_{13}y_{31} + x_{14}y_{41} \\ x_{21}y_{11} + x_{22}y_{21} + x_{23}y_{31} + x_{24}y_{41} \end{bmatrix}$
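Written entry by entry, the product above can be stated as follows (the index notation and the Kronecker delta $\delta_{ij}$ are my own shorthand, not part of the setup):

$$z_{i1} = \sum_{k=1}^{4} x_{ik}\, y_{k1}, \qquad \frac{\partial z_{i1}}{\partial x_{jk}} = \delta_{ij}\, y_{k1}, \qquad i, j \in \{1, 2\},\ k \in \{1, \dots, 4\}.$$

So each $z_{i1}$ depends only on row $i$ of $X$, and its derivative with respect to that row is $\begin{bmatrix} y_{11} & y_{21} & y_{31} & y_{41} \end{bmatrix} = Y^T$.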
How can we prove that $\frac{\mathrm{d} Z}{\mathrm{d} X} = Y^T$?
I am asking this question to understand backpropagation when training a neural network.
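For context, here is a rough numerical check of how I understand the claim in the backpropagation setting. It is only a sketch under assumptions that are mine: the numeric values of `X` and `Y` are made up, and I assume a scalar loss equal to the sum of the entries of $Z$ (so the upstream gradient is all ones). The helper names `matmul`, `loss`, and `numeric_grad` are hypothetical.

```python
# Hypothetical numeric values; any numbers would do.
X = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]]
Y = [[0.5], [-1.0], [2.0], [3.0]]

def matmul(A, B):
    """Plain-Python matrix product of nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def loss(X):
    """Assumed scalar loss: the sum of all entries of Z = XY."""
    return sum(sum(row) for row in matmul(X, Y))

def numeric_grad(f, X, eps=1e-6):
    """Central finite-difference estimate of df/dX, entry by entry."""
    grad = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            plus = [row[:] for row in X]
            minus = [row[:] for row in X]
            plus[i][j] += eps
            minus[i][j] -= eps
            grad[i][j] = (f(plus) - f(minus)) / (2 * eps)
    return grad

grad = numeric_grad(loss, X)   # 2x4 estimate of d(loss)/dX
y_t = [y[0] for y in Y]        # the single row of Y^T

# Each row of the estimated gradient should match Y^T up to rounding.
for row in grad:
    print([round(g, 4) for g in row], "vs", y_t)
```

Under this assumed loss, every row of the estimated gradient agrees with $Y^T$, which is how I read the statement $\frac{\mathrm{d} Z}{\mathrm{d} X} = Y^T$.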