I am stuck on computing the derivative of the following Hadamard product:
$$ \frac{\partial}{\partial W} (\vec{a}*W) \circ (\vec{b}*W) $$
$W$ is a randomly initialized matrix. I read the post on Derivative of Hadamard product, but I still don't understand how to perform the calculation. Do I just have to use the product rule here?
Let $\vec{a}*W$ denote the usual product of the vector $\vec{a}$ (written as a row matrix) and the matrix $W=[w_1,\,w_2,\,\ldots,\, w_n]$, in which each $w_i$ is a column vector. Then $$ (\vec{a}*W)=[\vec{a}*w_1,\,\vec{a}*w_2,\,\ldots,\, \vec{a}*w_n],$$ in which each $\vec{a}*w_i$ is a number. It follows that $$f(W)= (\vec{a}*W) \circ (\vec{b}*W)=[(\vec{a}*w_1)(\vec{b}*w_1),\,(\vec{a}*w_2)(\vec{b}*w_2),\,\ldots,\,(\vec{a}*w_n)(\vec{b}*w_n)]$$ is a row matrix (or a vector). Applying the product rule to each entry gives $$\frac{\partial (\vec{a}*w_j)(\vec{b}*w_j)}{\partial w_i} =\left\{\begin{array}{ll}(\vec{a}*w_j)\vec{b}+(\vec{b}*w_j)\vec{a},&\quad i=j\\0,&\quad i\neq j\end{array}\right. ,$$ and you can regard $$\frac{\partial (\vec{a}*W) \circ (\vec{b}*W)}{\partial W}$$ as the Jacobian matrix of $f(W)$, for instance by identifying $W$ with $\operatorname{vec}(W)$ (the vectorization of $W$).
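As a sanity check on the entrywise formula, here is a minimal pure-Python sketch that builds the Jacobian with respect to $\operatorname{vec}(W)$ and compares it against central finite differences. The helper names (`row_times_matrix`, `analytic_jacobian`, `numeric_jacobian`), the sizes $m=3$, $n=4$, and the column-stacking convention for $\operatorname{vec}(W)$ are my own assumptions for illustration, not part of the question.

```python
import random

def row_times_matrix(v, W):
    """Row vector v (length m) times W (m x n, stored as a list of rows)."""
    m, n = len(W), len(W[0])
    return [sum(v[k] * W[k][j] for k in range(m)) for j in range(n)]

def f(a, b, W):
    """f(W) = (a W) ∘ (b W), returned as a list of length n."""
    return [x * y for x, y in zip(row_times_matrix(a, W), row_times_matrix(b, W))]

def analytic_jacobian(a, b, W):
    """Jacobian of f w.r.t. vec(W) (columns stacked), shape n x (m*n)."""
    m, n = len(W), len(W[0])
    aW, bW = row_times_matrix(a, W), row_times_matrix(b, W)
    J = [[0.0] * (m * n) for _ in range(n)]
    for j in range(n):
        for k in range(m):
            # Only column w_j affects entry j: (a w_j) b_k + (b w_j) a_k.
            J[j][j * m + k] = aW[j] * b[k] + bW[j] * a[k]
    return J

def numeric_jacobian(a, b, W, eps=1e-6):
    """Central finite differences, one entry of W at a time."""
    m, n = len(W), len(W[0])
    J = [[0.0] * (m * n) for _ in range(n)]
    for j in range(n):
        for k in range(m):
            W_plus = [row[:] for row in W]
            W_minus = [row[:] for row in W]
            W_plus[k][j] += eps
            W_minus[k][j] -= eps
            f_plus, f_minus = f(a, b, W_plus), f(a, b, W_minus)
            for i in range(n):
                J[i][j * m + k] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return J

random.seed(0)
m, n = 3, 4  # assumed sizes, chosen only for the demonstration
a = [random.uniform(-1, 1) for _ in range(m)]
b = [random.uniform(-1, 1) for _ in range(m)]
W = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]

J_exact = analytic_jacobian(a, b, W)
J_num = numeric_jacobian(a, b, W)
max_err = max(abs(J_exact[i][c] - J_num[i][c])
              for i in range(n) for c in range(m * n))
print("largest discrepancy:", max_err)
```

The Jacobian comes out block-diagonal in this layout: row $j$ is nonzero only in the $m$ columns belonging to $w_j$, which is the $i\neq j$ case of the formula.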
Alternatively, expanding $f(W+H)$ gives $$f(W+H)=f(W)+\left((\vec{a}*H) \circ (\vec{b}*W)+(\vec{a}*W) \circ (\vec{b}*H)\right)+(\vec{a}*H) \circ (\vec{b}*H).$$ Since the last term is quadratic in $H$, this implies that $$\frac{\partial (\vec{a}*W) \circ (\vec{b}*W)}{\partial W}H=(\vec{a}*H) \circ (\vec{b}*W)+(\vec{a}*W) \circ (\vec{b}*H),$$ when you regard $$\frac{\partial (\vec{a}*W) \circ (\vec{b}*W)}{\partial W}$$ as a linear transformation acting on $H$.
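The expansion can be checked numerically as well. The sketch below, again in pure Python with hypothetical helper names (`hadamard`, `differential`) and assumed sizes, verifies that $f(W+H)$ equals $f(W)$ plus the linear term plus the quadratic remainder, and that the difference quotient approaches the linear term as the step $t$ shrinks.

```python
import random

def row_times_matrix(v, W):
    """Row vector v (length m) times W (m x n, stored as a list of rows)."""
    m, n = len(W), len(W[0])
    return [sum(v[k] * W[k][j] for k in range(m)) for j in range(n)]

def hadamard(x, y):
    """Entrywise product of two equal-length lists."""
    return [p * q for p, q in zip(x, y)]

def f(a, b, W):
    """f(W) = (a W) ∘ (b W)."""
    return hadamard(row_times_matrix(a, W), row_times_matrix(b, W))

def differential(a, b, W, H):
    """Linear term of the expansion: (a H) ∘ (b W) + (a W) ∘ (b H)."""
    first = hadamard(row_times_matrix(a, H), row_times_matrix(b, W))
    second = hadamard(row_times_matrix(a, W), row_times_matrix(b, H))
    return [p + q for p, q in zip(first, second)]

def shifted(W, H, t=1.0):
    """Return W + t*H entrywise."""
    return [[w + t * h for w, h in zip(w_row, h_row)] for w_row, h_row in zip(W, H)]

random.seed(1)
m, n = 3, 4  # assumed sizes, chosen only for the demonstration
a = [random.uniform(-1, 1) for _ in range(m)]
b = [random.uniform(-1, 1) for _ in range(m)]
W = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]
H = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]

# Exact identity: f(W+H) = f(W) + linear term + (a H) ∘ (b H).
lhs = f(a, b, shifted(W, H))
linear = differential(a, b, W, H)
quadratic = hadamard(row_times_matrix(a, H), row_times_matrix(b, H))
rhs = [f0 + l + q for f0, l, q in zip(f(a, b, W), linear, quadratic)]
residual = max(abs(x - y) for x, y in zip(lhs, rhs))
print("identity residual:", residual)

# Difference quotient (f(W + tH) - f(W)) / t should approach the linear term.
t = 1e-4
quotient = [(p - q) / t for p, q in zip(f(a, b, shifted(W, H, t)), f(a, b, W))]
gap = max(abs(x - y) for x, y in zip(quotient, linear))
print("difference quotient gap at t = 1e-4:", gap)
```

The gap in the second check should be of order $t$, since the only term left over is $t\,(\vec{a}*H)\circ(\vec{b}*H)$.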