I have a sparse matrix factorization problem, where I want to decompose a matrix $X\in \mathbb{R}^{n\times m}$ to $A\in \mathbb{R}^{n\times p}$ and $B\in \mathbb{R}^{p\times m}$, such that $X\approx AB$.
My matrix $X$ might have outlier values and missing values (that are replaced by zero). While I am aware of robust PCA, and low-rank plus sparse decomposition (E.g. in Candès et. al 2009-2016), I do not want to address the outliers directly.
I want to solve the following optimization program instead:
$$\text{minimize}_{A,B\geq 0} ~~\beta\|X-AB\|_F^2+(1-\beta)\|X-AB\|_1+\gamma_1\|A\|_1+\gamma_2\|B\|_1+\lambda\|A\|_*$$
$\|A\|_*$ is the nuclear norm of $A$, and it tries to encourage $A$ to be low-rank. Since this program is not jointly convex in $A$ and $B$, I want to solve it by alternating between the two, so let's assume I just want to solve it for $A$:
$$\text{minimize}_{A\geq 0} ~~\beta\|X-AB\|_F^2+(1-\beta)\|X-AB\|_1+\gamma_1\|A\|_1+\lambda\|A\|_*$$
I have the following questions:
Can I use a sequence of proximal projections for iteratively solving this problem? If yes, what should be the sequence of applying the projections?
If not, what is the best approach? I have tried subgradient descent, and it is not giving me a satisfactory convergence rate.
I would like to avoid off-the-shelf solvers, but if there is too much hassle to implement this myself (e.g. in Python's Scipy), are there any free solvers out there that scale to $n\approx10^9$ and $m\approx 10^4$?
Thanks!
I ended up doing the following, so I thought maybe I should post it here too.
We can also formulate this problem as a cone program, some thing like:
$\text{minimize}_{A\geq 0} ~~\beta\|X-AB\|_F^2+(1-\beta)t_1+\gamma_1 t_2+\lambda \|A\|_*$
$\text{subject to}$
$\quad\|X-AB\|_1\leq t_1$
$\quad\|A\|_1\leq t_2$
Since we know that the consraints $\|X-AB\|_1\leq t_1$ and $\|A\|_1\leq t_2$ can be replaced by affine inequalities, and since we have the following property. We can solve the problem using the interior point method. $$\begin{array}{ll} \text{minimize} & \|X\|_* \\ \text{subject to} & \mathcal{A}(X)=b \end{array} \quad\Longleftrightarrow\quad \begin{array}{ll} \text{minimize} & \tfrac{1}{2}\left( \mathop{\textrm{Tr}}W_1 + \mathop{\textrm{Tr}}W_2 \right) \\ \text{subject to} & \begin{bmatrix} W_1 & X \\ X^T & W_2 \end{bmatrix} \succeq 0 \\ & \mathcal{A}(X)=b \end{array} $$
Ideas are welcome!