Derivative of a matrix-vector multiplication with respect to the matrix


Let's assume that we have $\textbf{x} \in \mathbb{R}^n$ and $A \in \mathbb{R}^{m \times n}$. I want to show that $\frac{\partial(\textbf{Ax})}{\partial\textbf{A}} = \textbf{x}^T$, as claimed, e.g., in [0], where we have:

$$\frac{\partial\textbf{W}_3\textbf{x}_2}{\partial\textbf{W}_3} = \textbf{x}_2^T$$

for $\frac{\partial E}{\partial\textbf{W}_3}$. As far as I understand, $\textbf{Ax} \in \mathbb{R}^{m}$, so we are really differentiating a vector with respect to a matrix, which gives a third-order tensor (a 3D array). I'm not sure how this can be equal to $\textbf{x}^T$.
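For context, the way $\textbf{x}^T$ usually shows up is through the chain rule for a *scalar* loss $E$: if $\textbf{y} = \textbf{Ax}$, then $\frac{\partial E}{\partial \textbf{A}} = \frac{\partial E}{\partial \textbf{y}}\textbf{x}^T$, an outer product, so the third-order tensor never has to be formed explicitly. A minimal numerical sketch of this identity (the sizes, RNG seed, and the concrete loss $E = \frac{1}{2}\|\textbf{Ax}\|^2$ are my own assumptions for illustration, not from [0]):

```python
import numpy as np

# Hypothetical small sizes for illustration.
m, n = 3, 4
rng = np.random.default_rng(0)
A = rng.standard_normal((m, n))
x = rng.standard_normal(n)

# Scalar loss E = g(Ax); here g(y) = 0.5 * ||y||^2 for concreteness.
def E(A):
    y = A @ x
    return 0.5 * y @ y

# Chain rule: dE/dA = (dE/dy) x^T, an m-by-n outer product.
y = A @ x
dE_dy = y                         # gradient of 0.5 * ||y||^2 w.r.t. y
grad_analytic = np.outer(dE_dy, x)

# Central finite-difference check, entry by entry.
eps = 1e-6
grad_numeric = np.zeros_like(A)
for i in range(m):
    for j in range(n):
        Ap = A.copy(); Ap[i, j] += eps
        Am = A.copy(); Am[i, j] -= eps
        grad_numeric[i, j] = (E(Ap) - E(Am)) / (2 * eps)

print(np.allclose(grad_analytic, grad_numeric, atol=1e-5))
```

The finite-difference gradient matches the outer-product formula, which is why backprop derivations write $\frac{\partial E}{\partial\textbf{W}_3} = \boldsymbol{\delta}\,\textbf{x}_2^T$ rather than manipulating the full tensor $\frac{\partial(\textbf{Ax})}{\partial\textbf{A}}$.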


[0] Sudeep Raja, A derivation of backpropagation in matrix form, August 17, 2016.