Consecutive Proximal Projections as a Solution to Sum of Many Functions Each with Known Proximal Operator

Question

Consecutive Proximal Projections as a Solution to Sum of Many Functions Each with Known Proximal Operator

116 Views Asked by Bumbble Comm At 11 May 2026 - 12:47

I have a sparse matrix factorization problem, where I want to decompose a matrix $X\in \mathbb{R}^{n\times m}$ to $A\in \mathbb{R}^{n\times p}$ and $B\in \mathbb{R}^{p\times m}$, such that $X\approx AB$.

My matrix $X$ might have outlier values and missing values (that are replaced by zero). While I am aware of robust PCA, and low-rank plus sparse decomposition (E.g. in Candès et. al 2009-2016), I do not want to address the outliers directly.

I want to solve the following optimization program instead:

$$\text{minimize}_{A,B\geq 0} ~~\beta\|X-AB\|_F^2+(1-\beta)\|X-AB\|_1+\gamma_1\|A\|_1+\gamma_2\|B\|_1+\lambda\|A\|_*$$

$\|A\|_*$ is the nuclear norm of $A$, and it tries to encourage $A$ to be low-rank. Since this program is not jointly convex in $A$ and $B$, I want to solve it by alternating between the two, so let's assume I just want to solve it for $A$:

$$\text{minimize}_{A\geq 0} ~~\beta\|X-AB\|_F^2+(1-\beta)\|X-AB\|_1+\gamma_1\|A\|_1+\lambda\|A\|_*$$

I have the following questions:

Can I use a sequence of proximal projections for iteratively solving this problem? If yes, what should be the sequence of applying the projections?
If not, what is the best approach? I have tried subgradient descent, and it is not giving me a satisfactory convergence rate.
I would like to avoid off-the-shelf solvers, but if there is too much hassle to implement this myself (e.g. in Python's Scipy), are there any free solvers out there that scale to $n\approx10^9$ and $m\approx 10^4$?

Thanks!

Original Q&A

There are 2 best solutions below

Bumbble Comm On 01 Feb 2017 - 9:22

The term "$\beta\|X-AB\|_{\text{F}}^2 + (1-\beta)\|X-AB\|_1$" really looks like bad modelling. What are you trying to capture with it ? Aren't you better off with just the term $\|X-AB\|_{\text{F}}^2$ ?
Sequence of proximal operators applied anyhow will certainly diverge
Subgradient descent has very bad complexity: $\mathcal O(1/\epsilon^2)$. You don't want to be doing this since this will only be a subproblem to solve at every step...
There are proximal forward-backward algorithms which can deal with multiple proximable functions. For example there is the Vu-Condat algorithm (see section 5).
Finally, your original problem though nonconvex nonsmooth, it's semi-algebraic and can be attacked directly using recent works from Jerome Bolte, Heddy Attouch, and Marc Teboulle. For example, see page 12 of these slides.

**Bumbble Comm** · Accepted Answer

I ended up doing the following, so I thought maybe I should post it here too.

We can also formulate this problem as a cone program, some thing like:

$\text{minimize}_{A\geq 0} ~~\beta\|X-AB\|_F^2+(1-\beta)t_1+\gamma_1 t_2+\lambda \|A\|_*$

$\text{subject to}$

$\quad\|X-AB\|_1\leq t_1$

$\quad\|A\|_1\leq t_2$

Since we know that the consraints $\|X-AB\|_1\leq t_1$ and $\|A\|_1\leq t_2$ can be replaced by affine inequalities, and since we have the following property. We can solve the problem using the interior point method. $$\begin{array}{ll} \text{minimize} & \|X\|_* \\ \text{subject to} & \mathcal{A}(X)=b \end{array} \quad\Longleftrightarrow\quad \begin{array}{ll} \text{minimize} & \tfrac{1}{2}\left( \mathop{\textrm{Tr}}W_1 + \mathop{\textrm{Tr}}W_2 \right) \\ \text{subject to} & \begin{bmatrix} W_1 & X \\ X^T & W_2 \end{bmatrix} \succeq 0 \\ & \mathcal{A}(X)=b \end{array} $$

Ideas are welcome!

Consecutive Proximal Projections as a Solution to Sum of Many Functions Each with Known Proximal Operator

There are 2 best solutions below

Related Questions in CONVEX-OPTIMIZATION

Related Questions in CONVEX-CONE

Related Questions in SUBGRADIENT

Related Questions in PROXIMAL-OPERATORS

Related Questions in NUCLEAR-NORM

Trending Questions

Popular # Hahtags

Popular Questions