I would like to take the derivative of the following expression with respect to the vector $x\in\mathbb{R}^d$: $$ W\mathrm{diag}(f(Ax+b)) $$ where $f$ is some smooth element-wise function, $A\in \mathbb{R}^{K\times d}$, $b\in\mathbb{R}^K$ and $W\in\mathbb{R}^{m\times K}$. I realise the result might be a higher-dimensional tensor, but I do not know how to proceed.
Edit: I tried writing the expression using the Hadamard product: $$\left\{\mathbf{1}(f(Ax+b))^\top\right\} \circ W$$ where $\mathbf{1}\in\mathbb{R}^m$ is the all-ones vector. However, this is still not something I know how to work with.
Let $F(x):=W\mathrm{diag}(f(Ax+b))$, then we can simply rewrite it as
$$F(x)=W\sum_{j=1}^Ke_je_j^Tf(e_j^TAx+b_j)$$
where $\{e_i\}$ is the natural basis for $\mathbb{R}^K$ and $b=[b_i]$. This is simply the identity $\mathrm{diag}(v)=\sum_{j=1}^K e_je_j^Tv_j$ applied to $v=f(Ax+b)$.
The most convenient representation (at least to me) of the derivative with respect to $x$ is then
$$\dfrac{\partial F(x)}{\partial x_i}=W\sum_{j=1}^Ke_je_j^Ta_{ji}f'(e_j^TAx+b_j)$$
where $A=[a_{ij}]$, $f'$ is the derivative of $f$, and we have applied the chain rule, using $\partial(e_j^TAx+b_j)/\partial x_i=a_{ji}$.
This can be rewritten in terms of the initial "diag" operator as
$$\dfrac{\partial F(x)}{\partial x_i}=WD_i\mathrm{diag}(f'(Ax+b))$$
where $D_i:=\mathrm{diag}(a_{1i},\ldots,a_{Ki})$.
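As a sanity check, here is a minimal NumPy sketch that compares the closed form $WD_i\mathrm{diag}(f'(Ax+b))$ against central finite differences. The dimensions, the random data, the step size `h`, and the choice $f=\tanh$ are all assumptions made for illustration; the helper names are mine and not part of any library.

```python
import numpy as np

def F(x, W, A, b, f):
    """F(x) = W diag(f(Ax + b)), an m x K matrix."""
    return W @ np.diag(f(A @ x + b))

def dF_dxi(x, i, W, A, b, fprime):
    """Closed form: W D_i diag(f'(Ax + b)), with D_i = diag(A[:, i])."""
    D_i = np.diag(A[:, i])
    return W @ D_i @ np.diag(fprime(A @ x + b))

def max_fd_error(W, A, b, x, f, fprime, h=1e-6):
    """Largest absolute gap between the closed form and central differences."""
    err = 0.0
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h  # perturb only coordinate i
        fd = (F(x + e, W, A, b, f) - F(x - e, W, A, b, f)) / (2 * h)
        err = max(err, np.abs(fd - dF_dxi(x, i, W, A, b, fprime)).max())
    return err

# Illustrative sizes and random data (assumptions, not from the question).
rng = np.random.default_rng(0)
m, K, d = 3, 4, 5
W = rng.standard_normal((m, K))
A = rng.standard_normal((K, d))
b = rng.standard_normal(K)
x = rng.standard_normal(d)

# f = tanh is an arbitrary smooth choice; f'(z) = 1 - tanh(z)^2.
err = max_fd_error(W, A, b, x, np.tanh, lambda z: 1.0 - np.tanh(z) ** 2)
print(err)  # should be tiny, limited only by finite-difference error
```

Stacking the $d$ matrices $\partial F/\partial x_i$ along a third axis gives the $m\times K\times d$ tensor mentioned in the question, with entries $\partial F_{pq}/\partial x_i = W_{pq}\,a_{qi}\,f'\big((Ax+b)_q\big)$.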