In the paper "Generalized Low Rank Approximations of Matrices", the proof of a theorem reads:
``[...] the minimization in Eq. (1) is equivalent to minimizing $$\sum_{i=1}^n\mbox{trace}(D_iD_i^T) - 2\sum_{i=1}^n\mbox{trace}(LD_iR^{T}A_i^{T}).~~~~~~~~~~~(3)$$ It is easy to check that the minimum of $(3)$ is obtained, only if $D_i= L^{T}A_iR$, for every $i$. This completes the proof of the theorem. $\square$''
The dimensions of the matrices are as follows:
$A_i$ is $r \times c$
$L$ is $r \times l_1$
$R$ is $c \times l_2$
$D_i$ is $l_1 \times l_2$
Why does it say that $D_i = L^T A_i R$ is the optimal solution? When we substitute this solution into the objective function it comes out zero, but how do we prove that it is optimal? Thanks.
Equation (3) is separable over $i$, so you can differentiate (3) with respect to each $D_i$ and set the result to zero: $$ 2D_i-2L^TA_iR = 0, $$ using $\frac{\partial}{\partial X}\text{Trace}(XX^T)=X+X=2X$ and $\frac{\partial}{\partial X}\text{Trace}(CXB)=C^TB^T$ (here $C=L$ and $B=R^TA_i^T$, so the derivative of the second term is $L^T(R^TA_i^T)^T = L^TA_iR$). Since (3) is a convex quadratic function of each $D_i$ (its quadratic part is $\text{trace}(D_iD_i^T)=\|D_i\|_F^2$), this stationary point is the global minimum. Thus $D_i=L^TA_iR$ for each $i$.
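As a sanity check, here is a small NumPy sketch (dimensions and matrices are made up for illustration) that evaluates Eq. (3) directly and verifies that random perturbations of $D_i = L^TA_iR$ never decrease the objective:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small dimensions: A_i is r x c, L is r x l1, R is c x l2.
r, c, l1, l2, n = 5, 4, 3, 2, 3
A = [rng.standard_normal((r, c)) for _ in range(n)]
# L and R with orthonormal columns, as assumed in the paper.
L = np.linalg.qr(rng.standard_normal((r, l1)))[0]
R = np.linalg.qr(rng.standard_normal((c, l2)))[0]

def objective(D):
    # Eq. (3): sum_i trace(D_i D_i^T) - 2 sum_i trace(L D_i R^T A_i^T)
    return sum(np.trace(Di @ Di.T) - 2 * np.trace(L @ Di @ R.T @ Ai.T)
               for Di, Ai in zip(D, A))

# Claimed minimizer from the theorem.
D_star = [L.T @ Ai @ R for Ai in A]

# Perturbing the minimizer in any direction should not lower the objective;
# in fact the increase equals the squared Frobenius norm of the perturbation.
for _ in range(100):
    E = [0.1 * rng.standard_normal(Di.shape) for Di in D_star]
    D_pert = [Di + Ei for Di, Ei in zip(D_star, E)]
    gap = objective(D_pert) - objective(D_star)
    expected = sum(np.sum(Ei**2) for Ei in E)
    assert abs(gap - expected) < 1e-10
```

The final assertion mirrors the convexity argument: expanding the objective at $D_i^* + E_i$ shows the cross terms cancel and the change is exactly $\sum_i\|E_i\|_F^2 \ge 0$.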