The problem is formulated as follows:
Let $U\in\mathbb{R}^{d\times R}$ denote the variable matrix, let $A_i\in\mathbb{R}^{d\times d}$, and let each $y_i$ be a scalar. Define $$ f(U) = \frac{1}{2m}\sum^m_{i=1}(\langle{A_i, UU^T}\rangle-y_i)^2 $$
Compute the gradient of $f(U)$ with respect to $U$.
I have not worked with inner products much before and am confused about how to take the gradient of this function. I know that
$$ \frac{d}{dt} \langle f, g \rangle = \langle f(t), g^{\prime}(t) \rangle + \langle f^{\prime}(t), g(t) \rangle $$
But when I apply this, the matrix dimensions do not match. I am also not sure this rule is even correct here, since the variable is a matrix. Any help would be great, as I am pretty lost on how to begin.
For ease of typing, I'll use a colon to denote the (Frobenius) inner product, i.e. $$A:B = \langle A,B\rangle = \operatorname{tr}(A^TB)$$
Define a vector $v$ whose $i^{th}$ component is given by $$v_i = A_i:(UU^T) - y_i$$ and a matrix $M$ equal to the mean of the $A_i$ matrices, weighted by the components of $v$, $$M = \frac 1m\sum_{i=1}^m v_iA_i$$
Then the function can be written as $$f = \frac 1{2m}\;\sum_{i=1}^m v_i\,v_i$$
Two properties of the inner product are used below: it is unchanged when both arguments are transposed, $A:B = A^T:B^T$, and a right-hand factor can be moved across the colon, $A:(BC) = (AC^T):B$. Since the $A_i$ and $y_i$ are constants, $dv_i = A_i:d(UU^T)$.
Calculate the differential, and from it the gradient, as follows $$\eqalign{ df &= \frac 1m\;\sum_{i=1}^m v_i\,dv_i \\ &= \frac 1m\;\sum_{i=1}^m v_i\,A_i:d(UU^T) \\ &= M:d(UU^T) \\ &= M:(dU\,U^T+U\,dU^T) \\ &= (M+M^T):(dU\,U^T) \\ &= \left(M+M^T\right)U:dU \\ \frac{\partial f}{\partial U} &= \left(M+M^T\right)U \\ }$$
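If it helps to sanity-check the result $\left(M+M^T\right)U$, here is a minimal NumPy sketch that compares it against a central finite-difference approximation on random data. The function names (`f`, `grad_f`, `numerical_grad`), the problem sizes, and the random seed are my own choices for illustration, and the inner product is assumed to be the Frobenius one, $\langle A,B\rangle = \operatorname{tr}(A^TB)$.

```python
import numpy as np

def f(U, A, y):
    # A has shape (m, d, d). Residuals v_i = <A_i, U U^T> - y_i (Frobenius inner product).
    v = np.einsum('ijk,jk->i', A, U @ U.T) - y
    # f = 1/(2m) * sum_i v_i^2
    return 0.5 * np.mean(v ** 2)

def grad_f(U, A, y):
    # Closed form from the derivation: (M + M^T) U with M = (1/m) sum_i v_i A_i.
    v = np.einsum('ijk,jk->i', A, U @ U.T) - y
    M = np.einsum('i,ijk->jk', v, A) / len(y)
    return (M + M.T) @ U

def numerical_grad(U, A, y, eps=1e-6):
    # Central finite differences, one entry of U at a time.
    G = np.zeros_like(U)
    for idx in np.ndindex(U.shape):
        E = np.zeros_like(U)
        E[idx] = eps
        G[idx] = (f(U + E, A, y) - f(U - E, A, y)) / (2 * eps)
    return G

# Illustrative sizes and random data (assumptions, not from the question).
rng = np.random.default_rng(0)
m, d, R = 5, 4, 2
A = rng.standard_normal((m, d, d))   # A_i need not be symmetric
y = rng.standard_normal(m)
U = rng.standard_normal((d, R))

max_err = np.max(np.abs(grad_f(U, A, y) - numerical_grad(U, A, y)))
print("max abs difference:", max_err)
```

The two gradients should agree up to finite-difference error. Note that the $A_i$ are deliberately not symmetric here, so the check also exercises the $M^T$ term. If every $A_i$ is symmetric, the formula reduces to $2MU$.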