Do $\ell_{\infty}$ and $\ell_{2,1}$-norms give similar results? and why?

90 Views Asked by At

I am aware that $ \ell_{2,1} $-norm is being used for inducing a structure in the sparse matrix. I was told by someone that $ \ell_{\infty} $ and $ \ell_{2,1} $-norms give similar results. But how and why is that?

What are the differences, similarities, or advantages of $ \ell_{\infty} $-norm versus $ \ell_{2,1} $-norm in the case of norm minimization algorithms?

More specifically this question is in the context of Robust PCA (low-rank and sparse decomposition of a two dimensional matrix) where the mentioned norms are used to induce structured sparsity in the sparse part.

Robust PCA solves the matrix decomposition problem

$$L^*, S^* = \arg \min_{L, S} \|L\|_* + \|S\|_1 \quad \text{s.t. } L + S = X$$

as a surrogate for the actual problem

$$L^*, S^* = \arg \min_{L, S} rank(L) + \|S\|_0 \quad \text{s.t. } L + S = X,$$

To induce structured-sparsity in $S$ we can use the $\ell_{2,1}$ or $\ell_{\infty}$-norm. So the problem becomes:

$$L^*, S^* = \arg \min_{L, S} \|L\|_* + \|S\|_{2,1} \quad \text{s.t. } L + S = X$$

or

$$L^*, S^* = \arg \min_{L, S} \|L\|_* + \|S\|_{\infty} \quad \text{s.t. } L + S = X$$