Projection of a symmetric matrix onto the positive semidefinite (PSD) cone under the nuclear norm


Question: Given a symmetric matrix $S$, what is the solution to the optimization problem $$ \arg\min_{P \in \mathcal{S}_{\ge 0}} \| S - P \|_N $$ where $\| \cdot \|_N$ denotes the nuclear norm, i.e., the sum of the singular values or, equivalently for a symmetric matrix, the sum of the absolute values of the eigenvalues?

In other words, given a symmetric matrix $S$, what is the closest matrix $P \succeq 0$ in the positive semidefinite cone (denoted $\mathcal{S}_{\ge 0}$) with respect to the nuclear norm?


Guess: I would guess that the answer is the same as for the spectral norm and the Frobenius norm, provided that the nuclear norm is unitarily invariant (it is: singular values, and hence their sum, are unchanged under left or right multiplication by orthogonal matrices). Namely, the answer in those cases is apparently:

$$ V \Lambda^+ V^\top $$

where $V \Lambda V^\top$ is the eigenvalue decomposition of $S$, and $\Lambda^+$ is the diagonal matrix that is the same as $\Lambda$ except that any negative entries have been replaced with zero.
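Before attempting a proof, here is a minimal numpy sketch (a Monte Carlo sanity check only, with all function names my own) that compares this conjectured projection against randomly sampled PSD candidates, including candidates close to the guess itself:

```python
import numpy as np

rng = np.random.default_rng(0)

def nuclear_norm(A):
    # Nuclear norm = sum of singular values.
    return np.linalg.svd(A, compute_uv=False).sum()

def psd_projection_guess(S):
    # Conjectured projection: clip the negative eigenvalues of S at zero.
    lam, V = np.linalg.eigh(S)
    return (V * np.maximum(lam, 0.0)) @ V.T  # V diag(lam^+) V^T

n = 5
A = rng.standard_normal((n, n))
S = (A + A.T) / 2                            # random symmetric test matrix

P = psd_projection_guess(S)
d = nuclear_norm(S - P)

# Sample PSD candidates; convex combinations with P stay in the cone and
# let us probe points arbitrarily close to the guess.
worst = -np.inf
for _ in range(10_000):
    B = rng.standard_normal((n, n))
    Q = B @ B.T                              # random PSD matrix
    t = rng.uniform()
    cand = (1 - t) * P + t * Q               # PSD by convexity of the cone
    worst = max(worst, d - nuclear_norm(S - cand))

print("nuclear distance of guess:", d)
print("max improvement found by sampling:", worst)  # expect <= 0
```

No sampled candidate should come out closer to $S$ than the guess; of course this only supports the conjecture, it does not prove it.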

The following argument is obviously wrong, but imagine (in fantasy land) that for any symmetric matrix $R = U K U^\top$, swapping in the eigenvectors of $S$ to get $\tilde{R} = V K V^\top$ left the distance unchanged, i.e. $\| S - R \|_N = \| S - \tilde{R} \|_N$. Then we could assume without loss of generality that $U = V = I$, and computing the nuclear norm of the difference would reduce to computing $\| \Lambda - K \|_N$. But this is easy, since it is just the sum of the absolute values of the diagonal entries, $\sum_i | \lambda_i - \kappa_i |$, and the only way to minimize this while ensuring that all $\kappa_i \ge 0$ (is that claim actually true?) is to take $\kappa_i = \max\{ \lambda_i , 0 \}$, i.e. $K = \Lambda^+$.

Of course, the claim that we may take $U = V = I$ without loss of generality is obviously wrong: if the distance really were invariant under arbitrary rotations of $R$, the optimization problem could not have a unique minimizer, yet the PSD cone is convex and the nuclear norm is a convex function.

However, it would seem to suffice to show that $$ \| V \Lambda V^\top - V K V^\top \|_N \le \| V \Lambda V^\top - U K U^\top \|_N $$ for every orthogonal $U$. (Strict inequality for all $U \neq V$ is too much to hope for, since e.g. $U = -V$ produces the same matrix and hence equality.) So I guess two claims of questionable merit:

Claim 1: The minimizer of $$ \min_{\kappa_1, \ldots, \kappa_n \ge 0} \sum_{i} | \lambda_i - \kappa_i | $$ is $\kappa_i = \max \{0, \lambda_i\}$ for all $i$.
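Claim 1 at least seems safe, via a per-coordinate check: the sum decouples, so each term can be minimized separately. If $\lambda_i \ge 0$, taking $\kappa_i = \lambda_i$ makes the $i$-th term zero; if $\lambda_i < 0$, then for any $\kappa_i \ge 0$, $$ | \lambda_i - \kappa_i | = \kappa_i - \lambda_i \ge -\lambda_i = | \lambda_i |, $$ with equality exactly when $\kappa_i = 0$. Both cases are captured by $\kappa_i = \max\{0, \lambda_i\}$, and the minimum value is $\sum_{\lambda_i < 0} | \lambda_i |$.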

Claim 2: $$ \| V \Lambda V^\top - V K V^\top \|_N \le \| V \Lambda V^\top - U K U^\top \|_N $$ for every orthogonal $U$ and any diagonal $K$ whose entries are sorted in the same order as those of $\Lambda$. (The sorting matters: if $\Lambda = \mathrm{diag}(1, 0)$ and $K = \mathrm{diag}(0, 1)$, the aligned choice $U = V$ gives nuclear distance $2$, while the permutation that swaps the two coordinates gives $0$.)
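Here is a small numpy probe of this sorted, non-strict version of Claim 2 (again just a sanity check with names of my own choosing, not a proof):

```python
import numpy as np

rng = np.random.default_rng(1)

def nuclear_norm(A):
    # Nuclear norm = sum of singular values.
    return np.linalg.svd(A, compute_uv=False).sum()

n = 5
lam = np.sort(rng.standard_normal(n))[::-1]          # diagonal of Lambda, descending
kap = np.sort(np.abs(rng.standard_normal(n)))[::-1]  # diagonal of K >= 0, descending
Lam, K = np.diag(lam), np.diag(kap)

V, _ = np.linalg.qr(rng.standard_normal((n, n)))     # a fixed orthogonal V
lhs = nuclear_norm(V @ Lam @ V.T - V @ K @ V.T)      # aligned candidate

# Sample random orthogonal U; the claim predicts lhs <= rhs for every U.
worst = -np.inf
for _ in range(10_000):
    U, _ = np.linalg.qr(rng.standard_normal((n, n)))
    rhs = nuclear_norm(V @ Lam @ V.T - U @ K @ U.T)
    worst = max(worst, lhs - rhs)

print("aligned distance:", lhs)
print("max of lhs - rhs over samples:", worst)       # expect <= 0
```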

The following questions are all related but do not answer my question, because they either involve projection onto a set other than the PSD cone, or projection with respect to a distance other than the one induced by the nuclear norm (or with respect to an unspecified distance). So this question does not seem to be a duplicate of any of these.