I’m trying to compute the gradient of the function $g(W) = \|\vec{y}-\sigma(W\vec{x})\|_2^2$ with respect to the matrix $W$, where $\sigma(x) = \sin(x)$ is applied elementwise to the vector $W\vec{x}$. So far I have
$$ (\nabla g)_{ij} = \frac{\partial g}{\partial w_{ij}} = \frac{\partial}{\partial w_{ij}} \left[\sum_{k=1}^N (y_k - \sigma(W\vec{x})_k)^2\right] = \frac{\partial}{\partial w_{ij}} \left[\sum_{k=1}^N \left(y_k - \sin\left(\sum_{l=1}^M w_{kl}x_l \right)\right)^2\right] $$
I’m not sure what to do next. I suspect there is a much easier way that avoids expanding everything, but I don’t know how to take derivatives with respect to $w_{ij}$.
$ \def\a{\alpha}\def\b{\beta}\def\g{\gamma}\def\t{\theta} \def\l{\lambda}\def\s{\sigma}\def\e{\varepsilon} \def\n{\nabla}\def\o{{\tt1}}\def\p{\partial} \def\LR#1{\left(#1\right)} \def\op#1{\operatorname{#1}} \def\Diag#1{\op{Diag}\LR{#1}} \def\trace#1{\op{Tr}\LR{#1}} \def\qiq{\quad\implies\quad} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\c#1{\color{red}{#1}} \def\CLR#1{\c{\LR{#1}}} $For typing convenience, define the vector variables $$\eqalign{ v &= Wx &\qiq dv = dW\,x \\ c &= \cos(v) &\qiq C = \Diag c \\ s &= \sin(v) &\qiq ds = c\odot dv = \c{C\,dv} \\ p &= s-y &\qiq dp = ds = \c{C\,dW\,x} \\ }$$ and use them to write the cost function and calculate its differential and gradient $$\eqalign{ g &= p:p \\ dg &= dp:p + p:dp \\ &= 2p:\c{dp} \\ &= 2p:\CLR{C\,dW\,x} \\ &= 2\LR{Cpx^T}:dW \\ \grad{g}{W} &= 2\,Cpx^T \\ }$$ Here $:$ denotes the Frobenius product, $A:B = \sum_{i,j} A_{ij}B_{ij} = \trace{A^TB}$, and $\odot$ denotes the Hadamard (elementwise) product. The step from $2p:\CLR{C\,dW\,x}$ to $2\LR{Cpx^T}:dW$ uses the rearrangement rule $A:BC = B^TA:C = AC^T:B$ together with $C^T=C$, since $C$ is diagonal.
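
Written out in components, the result $2\,Cpx^T$ reads $\frac{\partial g}{\partial w_{ij}} = 2\left(\sin(v_i)-y_i\right)\cos(v_i)\,x_j$ with $v_i=\sum_l w_{il}x_l$, which is what the elementwise expansion in the question also leads to. As a sanity check, here is a minimal NumPy sketch that compares this closed form against central finite differences. The function names (`g`, `grad_g`, `numerical_grad`), the dimensions, the random seed, and the step size `h` are all illustrative choices and not part of the derivation.

```python
import numpy as np

def g(W, x, y):
    """Cost g(W) = ||y - sin(W x)||_2^2."""
    return np.sum((y - np.sin(W @ x)) ** 2)

def grad_g(W, x, y):
    """Closed-form gradient 2 C p x^T, with C = Diag(cos(Wx)) and p = sin(Wx) - y."""
    v = W @ x
    p = np.sin(v) - y
    # For diagonal C, the product C @ p is the elementwise product cos(v) * p.
    return 2.0 * np.outer(np.cos(v) * p, x)

def numerical_grad(W, x, y, h=1e-6):
    """Central finite differences, perturbing one entry of W at a time."""
    G = np.zeros_like(W)
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            E = np.zeros_like(W)
            E[i, j] = h
            G[i, j] = (g(W + E, x, y) - g(W - E, x, y)) / (2 * h)
    return G

# Illustrative sizes: W is N x M, x has length M, y has length N.
rng = np.random.default_rng(0)
N, M = 3, 4
W = rng.standard_normal((N, M))
x = rng.standard_normal(M)
y = rng.standard_normal(N)

# Largest entrywise gap between the analytic and numerical gradients.
max_err = np.max(np.abs(grad_g(W, x, y) - numerical_grad(W, x, y)))
print(max_err)
```

If the formula is right, the printed discrepancy should be on the order of the finite-difference error, far smaller than the gradient entries themselves.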