I have the following regression problem: $$ A^* = \operatorname*{argmin}_A \| Y - AX \|_F^2 $$ where $Y$ and $X$ are given matrices, and $A$ is the matrix that maps $X$ to $Y$.
There are two analytic solutions to this problem: $$ \tilde{A} = Y X^\dagger $$ and $$ \bar{A} = (Y X^\top) (X X^\top)^\dagger $$ where $\dagger$ denotes the pseudo-inverse.
I would expect both to give the same result. However, when I check the accuracy of the two solutions, the errors differ: $\varepsilon_1 \neq \varepsilon_2$, where \begin{align} \varepsilon_1 = \| Y - \tilde{A}X \|_F^2, \quad \varepsilon_2 = \| Y X^\top - \bar{A} (X X^\top)\|_F^2 \end{align}
Naively, I expected $\varepsilon_1 = \varepsilon_2$. Why are they different?
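The comparison can be reproduced with a minimal NumPy sketch. The shapes, the random seed, and the variable names below are illustrative assumptions, not taken from the original setup; `np.linalg.pinv` stands in for $\dagger$.

```python
import numpy as np

# Assumed toy data: X is 3 x 10 and Y is 2 x 10 (more columns than rows).
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 10))
Y = rng.standard_normal((2, 10))

# The two closed-form solutions from the question.
A_tilde = Y @ np.linalg.pinv(X)
A_bar = (Y @ X.T) @ np.linalg.pinv(X @ X.T)

# The two error measures from the question (squared Frobenius norms).
eps1 = np.linalg.norm(Y - A_tilde @ X, "fro") ** 2
eps2 = np.linalg.norm(Y @ X.T - A_bar @ (X @ X.T), "fro") ** 2

# Compare the solutions themselves, separately from the two errors.
print("max |A_tilde - A_bar|:", np.max(np.abs(A_tilde - A_bar)))
print("eps1:", eps1)
print("eps2:", eps2)
```

Comparing `A_tilde` and `A_bar` directly, separately from the two residuals, helps tell whether the solutions themselves differ or only the two error measures do. Note that $\varepsilon_1$ is measured on $Y - \tilde{A}X$, while $\varepsilon_2$ is measured on that kind of residual multiplied by $X^\top$.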