SDP formulation of noisy low rank matrix completion (2)


Thank you, Michael, for answering my previous question. It seems that I overlooked some problems in my initial question: I didn't recognize the SDP (in either primal or dual form) even for the 2-norm constrained problem. Now that I have a better understanding, I have another question.


For exact low rank matrix completion the corresponding SDP is:

$\text{minimize}_Y~\text{trace}(Y)$ subject to $\langle\left[ \begin{array}{cc} 0 & P_{ij} \\ P_{ij}^T & 0 \end{array} \right], Y \rangle = 2M_{ij}, \forall (i,j)\in\Omega$ and $Y \succeq 0$

where $P_{ij} = e_ie_j^T \in \mathbb{R}^{m \times n}$ and $Y$ is the matrix $\left[ \begin{array}{cc} W_1 & X \\ X^T & W_2 \end{array} \right]$ in the original question. This problem has $|\Omega|$ equality constraints.
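To sanity-check the equality constraint, note that pairing $\left[\begin{smallmatrix} 0 & P_{ij} \\ P_{ij}^T & 0 \end{smallmatrix}\right]$ with $Y$ picks out $2X_{ij}$. Here is a small numpy sketch of that identity (matrix sizes and the tested index are arbitrary choices, not part of the original problem):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 4

# Y = [[W1, X], [X^T, W2]] with symmetric W1, W2
X = rng.standard_normal((m, n))
W1 = rng.standard_normal((m, m)); W1 = W1 + W1.T
W2 = rng.standard_normal((n, n)); W2 = W2 + W2.T
Y = np.block([[W1, X], [X.T, W2]])

def constraint_matrix(i, j):
    """A_ij = [[0, P_ij], [P_ij^T, 0]] with P_ij = e_i e_j^T (m x n)."""
    P = np.zeros((m, n)); P[i, j] = 1.0
    return np.block([[np.zeros((m, m)), P], [P.T, np.zeros((n, n))]])

# <A_ij, Y> = trace(A_ij^T Y) picks out the (i,j) entry of X twice
i, j = 1, 2
val = np.trace(constraint_matrix(i, j).T @ Y)
assert np.isclose(val, 2 * X[i, j])
```

So setting $\langle A_{ij}, Y\rangle = 2M_{ij}$ forces $X_{ij} = M_{ij}$ on $\Omega$, which is exactly the exact-completion constraint.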


Now consider the noisy matrix completion problem with a 2-norm constraint. I think one SDP formulation is as follows. Let $w$ and $x$ be the vectorized versions of $(W_1,W_2)$ and $X$, respectively, and let $c$ be the appropriate binary mask vector that selects the diagonal entries of $W_1$ and $W_2$.

$\text{minimize}_{w,x}~~c^T\left[\begin{array}{c} w \\ x \end{array} \right]$ subject to

$\displaystyle\sum_{(i,j)\in\Omega}x_{ij} F_{ij} + \sum_{(i,j)\notin\Omega}x_{ij} G_{ij} + \sum_{1\le i\le j \le m}w^{(1)}_{ij}H_{ij} + \sum_{1\le i\le j \le n}w^{(2)}_{ij}K_{ij} - E \succeq 0$

where

$F_{ij} = \left[ \begin{array}{c|c} \begin{array}{cc} 0 & P_{ij} \\ P_{ij}^T & 0 \end{array} & 0 \\ \hline 0 & \begin{array}{cc} 0 & P_{ij} \\ P_{ij}^T & 0 \end{array}\\ \end{array} \right]$,

$G_{ij} = \left[ \begin{array}{c|c} 0 & 0 \\ \hline 0 & \begin{array}{cc} 0 & P_{ij} \\ P_{ij}^T & 0 \end{array}\\ \end{array} \right]$;

$H_{ij} = \left[ \begin{array}{cc} Q_{ij}+Q_{ji} & 0 \\ 0 & 0 \end{array} \right]$, where $Q_{ij}$ is an $m \times m$ matrix with a 1 in position $(i,j)$ and 0 everywhere else;

$K_{ij}$ is defined analogously to $H_{ij}$, with $n \times n$ matrices $Q_{ij}$ acting on the $W_2$ block;

and $E = \left[ \begin{array}{c|c} \begin{array}{cc} -\epsilon I & \mathcal{P}_\Omega(M) \\ \mathcal{P}_\Omega(M)^T & -\epsilon I \end{array} & 0 \\ \hline 0 & 0 \end{array} \right]$.
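The $-E$ term in the LMI contributes $+\epsilon I$ blocks and subtracts $\mathcal{P}_\Omega(M)$, so (assuming the intent of the top-left big block is the standard spectral-norm characterization) it encodes $\|\mathcal{P}_\Omega(X-M)\|_2 \le \epsilon$ via the fact that $\left[\begin{smallmatrix}\epsilon I & A \\ A^T & \epsilon I\end{smallmatrix}\right] \succeq 0 \iff \|A\|_2 \le \epsilon$. A numpy check of that underlying fact (sizes arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 4
A = rng.standard_normal((m, n))
eps = np.linalg.norm(A, 2)  # spectral norm of A

def lmi_min_eig(A, eps):
    """Smallest eigenvalue of [[eps*I, A], [A^T, eps*I]].

    The eigenvalues of this block matrix are eps +/- sigma_i(A),
    so it is PSD exactly when ||A||_2 <= eps.
    """
    M = np.block([[eps * np.eye(m), A], [A.T, eps * np.eye(n)]])
    return np.linalg.eigvalsh(M).min()

assert lmi_min_eig(A, eps + 1e-9) >= -1e-12  # feasible just above ||A||_2
assert lmi_min_eig(A, 0.5 * eps) < 0         # infeasible below ||A||_2
```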

The Frobenius-norm constrained problem can be formulated similarly by stacking.

My formulation above (if correct) seems quite naive, and taking its dual results in many equality constraints ($\sim(m+n)^2$).

While having more constraints/variables seems to be unavoidable for noisy problems, I am wondering if there is any better formulation with fewer equality constraints.

Thanks.

There is 1 answer below.


At some point, you have to decide when you make the leap from the theoretical to the practical.

As long as you're staying theoretical, it really doesn't matter how many equality constraints you have. Sure, it's $O((m+n)^2)$, but the point of the theoretical exercise is to demonstrate that the problem is representable by semidefinite programming. It frankly matters little how you go about it if the end result is a combination of linear matrix inequalities (LMIs), semidefinite-constrained variables, and linear equations.

Now, if you're interested in practical solutions, then you have to broaden your horizons a bit in a couple of different ways. First of all, you have to take note that there is not one "standard" form for semidefinite programming. Even if you limit yourself to totally pure semidefinite programming, there are two standard forms commonly considered: an equality-constrained form and an inequality form: $$\begin{array}{llcll} \text{minimize} & \langle C,X \rangle & \quad & \text{maximize} & b^T y \\ \text{subject to} & \mathcal{A}(X) = b & \quad & \text{subject to} & \mathcal{A}^*(y) + Z = C \\ & X \succeq 0 & \quad && Z \succeq 0 \end{array}$$ As hinted by the way I've written them, these are duals of each other. The important point is that any given model might be a better "fit" to one of these forms or the other, and it doesn't really matter. The complexity of solution is a function of the size of the semidefinite variables $X/Z$ and the number of rows of the linear operator $\mathcal{A}$. (Nor does it matter if you want to minimize and this form says maximize, or vice versa; just negate your objective vector.)
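To make the primal/dual pairing concrete, here is a toy instance small enough to solve by hand (one equality constraint; this illustrative example is mine, not part of the question): with $\mathcal{A}(X) = \operatorname{trace}(X)$ and $b = 1$, the primal optimum is $\lambda_{\min}(C)$, attained at a rank-one $X$, and the dual $\max\{y : C - yI \succeq 0\}$ gives the same value.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
C = rng.standard_normal((n, n)); C = C + C.T  # symmetric objective matrix

# Primal: min <C,X> s.t. trace(X) = 1, X >= 0.
# Optimum: X* = v v^T for the eigenvector v of the smallest eigenvalue.
lam, V = np.linalg.eigh(C)
Xstar = np.outer(V[:, 0], V[:, 0])
primal = np.trace(C @ Xstar)

# Dual: max y s.t. C - y*I = Z, Z >= 0.  Optimum: y* = lambda_min(C).
dual = lam[0]

assert np.isclose(np.trace(Xstar), 1.0)   # primal feasibility
assert np.isclose(primal, dual)           # strong duality holds here
assert np.linalg.eigvalsh(C - dual * np.eye(n)).min() >= -1e-12  # Z >= 0
```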

But these aren't the only standard forms considered. More practical choices, including both those considered in theoretical literature and those implemented in software, allow you to express at least some of the structure present in your model. For instance, it is quite common for $X/Z$ to have block diagonal structure; in the extreme, some of the blocks are 1x1, representing linear inequalities. Second-order cone inequalities (e.g., the Frobenius norm, the vector 2-norm) are much more efficient if handled natively (as in $O(n^2)\rightarrow O(n)$ savings).
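The savings above can be made concrete for the vector 2-norm: $\|x\|_2 \le t$ costs $n+1$ scalars as a native second-order cone constraint, while its equivalent LMI is the $(n+1)\times(n+1)$ "arrow" matrix $\left[\begin{smallmatrix} t & x^T \\ x & tI \end{smallmatrix}\right] \succeq 0$. A numpy sketch of the equivalence (sizes and tolerances arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
x = rng.standard_normal(n)

def arrow_min_eig(x, t):
    """Smallest eigenvalue of [[t, x^T], [x, t*I]].

    This matrix has eigenvalues t (multiplicity n-1) and t +/- ||x||_2,
    so it is PSD exactly when ||x||_2 <= t.
    """
    M = np.block([[np.array([[t]]), x[None, :]],
                  [x[:, None], t * np.eye(len(x))]])
    return np.linalg.eigvalsh(M).min()

t = np.linalg.norm(x)
assert arrow_min_eig(x, t + 1e-9) >= -1e-12  # ||x||_2 <= t holds
assert arrow_min_eig(x, t - 1e-3) < 0        # ||x||_2 <= t fails
```

Handling the cone natively avoids ever forming (or factoring) this $O(n^2)$-sized matrix, which is where the $O(n^2)\rightarrow O(n)$ savings comes from.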

In practice, it is very rare to encounter a model that closely fits one of the pure semidefinite standard forms listed above. So you should not be worrying at all about how efficient it is to represent a model in those forms. Instead, start looking at practical convex optimization solvers and the standard forms they employ, and see how your problem maps to those standard forms.

Or even better, don't do that, and let a modeling framework like CVX (disclosure: this is my software), YALMIP, or MOSEK's Fusion modeling system do the worrying instead. Let them decide how best to fit your model into their standard form, and even to decide whether the primal or dual form is better-suited for your model.