Variational characterization of nuclear norm


The nuclear norm $\|\cdot\|_*$ of a matrix is defined as the sum of its singular values. Working from the result at the bottom of this blog post, we have, for a matrix $\mathbf{X}$ and any factorization $\mathbf{X} = \mathbf{L} \mathbf{R}^T$,

$$ \|\mathbf{X}\|_* = \min_{\mathbf{X=LR}^T} \|\mathbf{L}\|_F \|\mathbf{R}\|_F = \min_{\mathbf{X=LR}^T} \frac{1}{2} \left(\|\mathbf{L}\|_F^2 + \|\mathbf{R}\|_F^2\right) $$

where the minimum is attained, for example, at $\mathbf{L} = \mathbf{U\Sigma}^{1/2}$, $\mathbf{R} = \mathbf{V\Sigma}^{1/2}$, with $\mathbf{X} = \mathbf{U\Sigma V}^T$ the SVD of $\mathbf{X}$.
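As a quick numerical sanity check of the identity above, here is a minimal NumPy sketch. The matrix sizes, the random seed, and the names `product_form`, `sum_form` and `alt_sum` are my own choices for illustration. It builds the SVD-based factors, confirms that both objective forms equal the sum of singular values, and then evaluates the objective at a different valid factorization, which should come out no smaller.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))

# Thin SVD: X = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(X, full_matrices=False)
nuclear = s.sum()  # nuclear norm = sum of singular values

# Balanced factors L = U Sigma^{1/2}, R = V Sigma^{1/2}, so that X = L @ R.T
L = U * np.sqrt(s)      # scales column j of U by sqrt(s[j])
R = Vt.T * np.sqrt(s)   # scales column j of V by sqrt(s[j])

product_form = np.linalg.norm(L, "fro") * np.linalg.norm(R, "fro")
sum_form = 0.5 * (np.linalg.norm(L, "fro") ** 2 + np.linalg.norm(R, "fro") ** 2)

# Another valid factorization: X = (L A)(R A^{-T})^T for any invertible A.
# The shift by 4*I is only there to keep A well conditioned.
A = rng.standard_normal((4, 4)) + 4.0 * np.eye(4)
L_alt = L @ A
R_alt = R @ np.linalg.inv(A).T
alt_sum = 0.5 * (np.linalg.norm(L_alt, "fro") ** 2 + np.linalg.norm(R_alt, "fro") ** 2)

print(nuclear, product_form, sum_form)  # the three values should agree
print(alt_sum)                          # should be at least as large as nuclear
```

The balanced choice works because $\|\mathbf{L}\|_F^2 = \|\mathbf{R}\|_F^2 = \operatorname{tr}(\mathbf{\Sigma})$, so the product and the half-sum coincide.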

The basis for this is the paper by Recht et al. from 2007: "Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization".

In a paper on Robust Principal Component Analysis (RPCA) by Feng et al., this result is used to reformulate the RPCA problem so as to minimize the nuclear norm without needing to access all the samples to perform an SVD calculation (see Eq.2 in that paper).


I'm interested in a possible extension to the case of two-dimensional SVD (original paper and Wikipedia article), where for a group of matrices $(\mathbf{X}_1, \ldots, \mathbf{X}_n)$ we instead have the following type of decomposition, with $\mathbf{L}$ and $\mathbf{R}$ shared across all $i$:

$$ \mathbf{X}_i = \mathbf{L} \mathbf{M}_i \mathbf{R}^T $$
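To make the shape of this decomposition concrete, here is a rough NumPy sketch of how I understand the 2DSVD construction: $\mathbf{L}$ and $\mathbf{R}$ come from the eigenvectors of the summed row-row and column-column Gram matrices, and $\mathbf{M}_i = \mathbf{L}^T \mathbf{X}_i \mathbf{R}$. This is a simplification under my own assumptions: it skips the mean-centering step and keeps all eigenvectors rather than truncating, so the reconstruction is exact. The sizes and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r, c = 6, 5, 4                     # n matrices, each of size r x c
Xs = rng.standard_normal((n, r, c))

# Summed Gram matrices (no mean-centering in this sketch)
F = sum(X @ X.T for X in Xs)          # r x r, row-row
G = sum(X.T @ X for X in Xs)          # c x c, column-column

# Orthonormal eigenvectors; all of them are kept, so L and R are orthogonal
_, L = np.linalg.eigh(F)
_, R = np.linalg.eigh(G)

# Per-matrix cores M_i = L^T X_i R, and the reconstruction X_i = L M_i R^T
Ms = np.array([L.T @ X @ R for X in Xs])
recon = np.array([L @ M @ R.T for M in Ms])

print(np.allclose(recon, Xs))
```

Note that $\mathbf{L}$ and $\mathbf{R}$ are computed once from the whole group, while only $\mathbf{M}_i$ changes with $i$.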

Is the following correct?

$$ \|\mathbf{X}_i\|_* = \min_{\mathbf{X}_i=\mathbf{L}\mathbf{M}_i\mathbf{R}^T} \frac{1}{2} \left(\|\mathbf{L}\|_F^2 + \|\mathbf{M}_i\|_F^2 + \|\mathbf{R}\|_F^2\right) $$

Or am I being far too hopeful for such a simple solution?