I have a function $f(\Theta) = \frac{1}{2N}\| y-\mathcal{X}(\Theta)\|_2^2$. Matrix $\Theta\in\mathbb{R}^{m_1\times m_2}$, $y=[y_1,\cdots,y_N]^T\in\mathbb{R}^N$ is the observation vector, and we use the observation matrices $\{X_i\}_{i=1}^N$ for each $X_i\in\mathbb{R}^{m_1\times m_2}$ to define an operator from $\mathbb{R}^{m_1\times m_2}\mapsto \mathbb{R}^N$ via $[\mathcal{X}(\Theta)]_i=trace(X_i^T \Theta)$.
I would like to find the gradient $\triangledown f(\Theta)$. Let $f(\Theta)=g(h(\Theta))$ where $g(z)= \frac{1}{2N}\|z\|_2^2$ and $h(\Theta)=y-\mathcal{X}(\Theta)$. Then $\triangledown f(\Theta)= \triangledown g\cdot \triangledown h$. It is easy to have $\triangledown g(z)=\frac{z}{N}$. Substitute $z=y-\mathcal{X}(\Theta)$ would give $\triangledown g$.
How would I find $\triangledown h(\Theta)$?
Rewrite your function as $$f(\Theta)=\frac1{2N}\sum_{i=1}^N|y_i-\mathrm{tr}X_i^T\Theta|^2$$ and hence $$\nabla_\Theta f(\Theta)=-\frac1N\sum_{i=1}^N|y_i-\mathrm{tr}X_i^T\Theta|~\nabla_\Theta\mathrm{tr}X_i^T\Theta=-\frac1N\sum_{i=1}^N|y_i-\mathrm{tr}X_i^T\Theta|~X_i$$ where the last equality follows from the fact $\nabla_X\mathrm{tr}(AXB)=A^TB^T$