I have the matrix equation $$ \boldsymbol{\Phi}\boldsymbol{w}=\boldsymbol{y} $$ where $\boldsymbol{y}$ has size $[v\times d]$, $\boldsymbol{\Phi}$ has size $[v\times c]$ and $\boldsymbol{w}$ has size $[c\times d]$. I am trying to compute $\frac{\partial \boldsymbol{y}}{\partial \boldsymbol{w}}$. It is given in the problem statement that $\frac{\partial \boldsymbol{y}}{\partial \boldsymbol{w}}$ has size $[[v\times d]\times[c\times d]]$. This is where I get confused. I understand that a vector $\mathbf{z}$'s derivative w.r.t a vector $\mathbf{x}$ has the form
$$\frac{\partial \mathbf{z}}{\partial \mathbf{x}}=\left[\begin{array}{cccc}\frac{\partial z_{1}}{\partial x_{1}} & \frac{\partial z_{1}}{\partial x_{2}} & \cdots & \frac{\partial z_{1}}{\partial x_{n}} \\ \frac{\partial z_{2}}{\partial x_{1}} & \frac{\partial z_{2}}{\partial x_{2}} & \cdots & \frac{\partial z_{2}}{\partial x_{n}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial z_{m}}{\partial x_{1}} & \frac{\partial z_{m}}{\partial x_{2}} & \cdots & \frac{\partial z_{m}}{\partial x_{n}}\end{array}\right].$$
If my matrices were vectors this would have made sense to me. But I'm confused as to how to proceed to compute $\frac{\partial \boldsymbol{y}}{\partial \boldsymbol{w}}$. Any suggestions or pointers to a similar example would be appreciated.
The matrix equation $$Y=\Phi W$$ can be vectorized (stacking columns) to a matrix-vector equation $$\eqalign{ {\rm vec}(Y) &= (I\otimes\Phi)\;{\rm vec}(W) \\ y &= (I\otimes\Phi)w \\ }$$ where $I$ is the $d\times d$ identity matrix. Its gradient is the $[vd\times cd]$ matrix $$\eqalign{ \frac{\partial y}{\partial w} &= (I\otimes\Phi) \\ }$$ Rather than vectorization, one can use the fourth-order tensor $\cal H$ with components
$${\cal H}_{ijk\ell} = \delta_{ik}\,\delta_{j\ell}$$ which is the identity tensor for the double-dot product, i.e. for an arbitrary matrix $A$ $$A = A:{\cal H}={\cal H}:A$$ This allows the gradient to be calculated directly as
$$\eqalign{ dY &= \Phi\,dW = \Phi({\cal H}:dW) \\ \frac{\partial Y}{\partial W} &= \Phi{\cal H} \\ }$$ which is itself a fourth-order tensor.
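As a rough numerical sanity check of the tensor form, here is a minimal NumPy sketch. The sizes `v, c, d` and the random matrices are arbitrary illustrative choices, not part of the question, and the `einsum` index strings are just one way to spell out the contractions: the double-dot product contracts two indices, while $\Phi{\cal H}$ contracts one.

```python
import numpy as np

# Illustrative sizes only (assumed, not from the question)
v, c, d = 4, 3, 2
rng = np.random.default_rng(0)
Phi = rng.standard_normal((v, c))
W = rng.standard_normal((c, d))

# H_{ijkl} = delta_{ik} delta_{jl}, acting on c x d matrices
H = np.einsum("ik,jl->ijkl", np.eye(c), np.eye(d))

# H is the identity for the double-dot product: A:H = H:A = A
assert np.allclose(np.einsum("ij,ijkl->kl", W, H), W)
assert np.allclose(np.einsum("ijkl,kl->ij", H, W), W)

# (Phi H)_{ijkl} = sum_n Phi_{in} H_{njkl}: a single-index contraction
G = np.einsum("in,njkl->ijkl", Phi, H)
print(G.shape)  # (4, 2, 3, 2), i.e. [v, d, c, d]

# Contracting the gradient with a perturbation dW reproduces dY = Phi dW
dW = rng.standard_normal((c, d))
dY = np.einsum("ijkl,kl->ij", G, dW)
assert np.allclose(dY, Phi @ dW)
```

The shape `(v, d, c, d)` is the tensor counterpart of the $[[v\times d]\times[c\times d]]$ size quoted in the question.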
In component form it reads $$\eqalign{ \frac{\partial Y_{ij}}{\partial W_{k\ell}} &= \sum_{n=1}^c \Phi_{in}{\cal H}_{njk\ell} \;=\; \Phi_{ik}\delta_{j\ell} \\ }$$
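To see that the component form and the Kronecker form describe the same object, one can flatten the fourth-order tensor and compare it with $I\otimes\Phi$. The sketch below assumes column-stacking for ${\rm vec}$ (NumPy's `order="F"`); the sizes and the helper name `vec` are hypothetical choices made for illustration.

```python
import numpy as np

# Illustrative sizes only (assumed, not from the question)
v, c, d = 4, 3, 2
rng = np.random.default_rng(1)
Phi = rng.standard_normal((v, c))
W = rng.standard_normal((c, d))


def vec(A):
    """Column-stacking vectorization (hypothetical helper)."""
    return A.reshape(-1, order="F")


# Component form: dY_ij / dW_kl = Phi_ik * delta_jl
G = np.einsum("ik,jl->ijkl", Phi, np.eye(d))

# Kronecker form of the same gradient, size (v*d, c*d)
K = np.kron(np.eye(d), Phi)
assert np.allclose(K @ vec(W), vec(Phi @ W))

# Flatten the tensor consistently with column stacking:
# row index (i, j) -> i + v*j, column index (k, l) -> k + c*l
G_mat = G.transpose(1, 0, 3, 2).reshape(d * v, d * c)
assert np.allclose(G_mat, K)
```

With a row-stacking convention the flattened matrix would instead be $\Phi\otimes I$, so the ordering of the indices matters when comparing the two forms.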