I'm a CS Student and I've implemented the Convex Non-Negative Matrix Factorization (Convex-NMF) Algorithm for a project.
Now, for "classic" NMF algorithms, you get an approximation:
$$ \mathbf{A} \approx \mathbf{WH} $$
with the goal is to minimize
$$ \| \mathbf{A-WH} \|_F^2 $$
which, according to [1] can be written as $$ \| \mathbf{A-WH} \|_F^2 = trace(\mathbf{A^TA}) - 2*trace(\mathbf{H^TW^TA}) + trace(\mathbf{H^TW^TWH}) $$
I've tried to understand it, and apply it to Convex-NMF, where the approximation is written as $\mathbf{X} \approx \mathbf{XWG^T}$
So I thought that: $$ \| \mathbf{x-XWG^T} \|_F^2 = trace(\mathbf{X^TX}) - 2*trace(\mathbf{X^TXWG^T}) + trace(\mathbf{X^TXWW^TGG^T}) $$
where $$ \mathbf{X} \in \mathbb{R}^{m\times n}, \quad \mathbf{W,G} \in \mathbb{R}^{n\times k} $$ (and some more constraints, that don't matter here). Well, the result I get is wrong, so I guess I messed up the math somewhere (which isn't my strong suit).
[1] Amy N. Langville, Michael W. Berry, Murray Browne, V. Paul Pauca, and Robert J. Plemmons. Al- gorithms and applications for approximate nonnegative matrix factorization. Computational Statistics and Data Analysis, 52(1):155–173, 2007.
We calculate $$ \DeclareMathOperator{\tr}{trace} \|X - XWG^T\|_F^2 = \\ \tr[(X - XWG^T)^T(X - XWG^T)] =\\ \tr(X^TX) - 2\tr(X^TXWG^T) + \tr(GW^TX^TXWG^T) $$ Note, however, that order matters in that last term.