How to divide the original optimization problem into several subproblems?


I have the following optimization problem:

$$\begin{array}{ll} \text{minimize} & \| \mathbf{X} - \mathbf{A} \|_F^2\\ \text{subject to} & {\mathbf{X}}\mathbb{1} = \mathbb{1}, \quad \mathbf{X} \geq 0\\ & \mathbf{X} = \mathbf{X}^T, \quad Tr(\mathbf{X}) = k \end{array}$$

in which $\mathbf{X}$ and $\mathbf{A}$ are matrices, $\mathbb{1}$ is the vector of all ones, and $Tr(\cdot)$ is the matrix trace.

I saw the paper (their Eq.(19)): http://www.kdd.org/kdd2016/papers/files/rfp1162-wangA.pdf

They solve the above problem via two subproblems:

$$\begin{array}{ll} \text{minimize} & \| \mathbf{X} - \mathbf{A} \|_F^2\\ \text{subject to} & {\mathbf{X}}\mathbb{1} = \mathbb{1}, \quad \mathbf{X} \geq 0 \end{array}$$ and

$$\begin{array}{ll} \text{minimize} & \| \mathbf{X} - \mathbf{A} \|_F^2\\ \text{subject to} & \mathbf{X} = \mathbf{X}^T, \quad Tr(\mathbf{X}) = k \end{array}$$

My concerns are the following:

(1) Why is this correct? What is the logic behind decomposing the original optimization into two subproblems?

(2) Finally, how does one obtain the final solution $\mathbf{X}$ from these two subproblems?


There are 2 best solutions below


It is explained on p. 4 of the paper linked in the question. The strategy is to iteratively solve the 2 subproblems, which are each easier to solve than the original problem. When the subproblems have converged to a value of $X$, that is the solution (i.e., $X$ wouldn't further change by more than a small convergence tolerance if further iterations were taken).

Our strategy is to solve two subproblems, Problem (20) and Problem (21), alternately, and let their solutions project mutually. In each iteration, we solve Problem (20) first and let its solution $\mathbf{M}_1$ be the matrix $\mathbf{T}$ in Problem (21); afterwards we solve Problem (21) and let its solution $\mathbf{M}_2$ play the role of matrix $\mathbf{T}$ in Problem (20). We solve these two problems alternately and iteratively until $\mathbf{M}$ converges.

According to von Neumann's successive projection lemma [16], this mutual projection strategy we use will converge to the intersection of the two subspaces formed by Problems (20) and (21). The lemma theoretically ensures that the solution of the alternate projection strategy ultimately converges onto the global optimal solution of Problem (19).


Maybe I will give an explanation from a different viewpoint. The projection of $a$ onto a set $S$ is the "closest point" of $S$ to $a$; that is, it is the solution of $$ \min_{x} \|x - a\| \text{ subject to } x \in S. $$
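As a concrete toy instance of this definition (my own example, not from the answer): projecting onto the nonnegative orthant $S = \{x : x \ge 0\}$ just clips the negative entries to zero, since each coordinate can be minimized independently.

```python
import numpy as np

def project_nonneg(a):
    # Euclidean projection of a onto {x : x >= 0}: the closest
    # nonnegative point is obtained by clipping negative entries.
    return np.maximum(a, 0.0)

a = np.array([1.5, -2.0, 0.3])
print(project_nonneg(a))  # -> [1.5 0.  0.3]
```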

Now, suppose you have two sets $S_1$ and $S_2$ for which you have good projection algorithms. A well-known method for finding a point of $S = S_1 \cap S_2$ is alternately projecting onto $S_1$ and then $S_2$ (and then $S_1$, and so on) until convergence. That is, we generate the sequence $$ X_{k+1} = P_{S_2}(P_{S_1}(X_k)). $$ By von Neumann's alternating projection theorem, if both sets are closed subspaces the sequence converges to the projection of the starting point onto $S$; for general closed convex sets the iterates still converge to a point of $S$ (recovering the exact projection in that case requires Dykstra's variant of the scheme).
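A minimal numpy sketch of this iteration, using two sets of my own choosing (a hyperplane and the nonnegative orthant, whose intersection is the probability simplex):

```python
import numpy as np

# Two convex sets: S1 = {x : sum(x) = 1} (a hyperplane),
#                  S2 = {x : x >= 0}     (the nonnegative orthant).
def P1(x):
    # Orthogonal projection onto the hyperplane sum(x) = 1.
    return x - (x.sum() - 1.0) / x.size

def P2(x):
    # Projection onto the nonnegative orthant.
    return np.maximum(x, 0.0)

x = np.array([3.0, -1.0, 0.5])   # arbitrary starting point
for _ in range(200):             # x_{k+1} = P2(P1(x_k))
    x = P2(P1(x))

print(x, x.sum())  # the limit lies in S1 ∩ S2 (nonnegative, sums to one)
```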

Your given problem is indeed a projection of $A$ onto the set defined by the constraints. The authors claim to have an efficient algorithm for projecting onto the set $S_1$ defined by the first two constraints, and onto the set $S_2$ defined by the last two constraints. What they suggest is simply the alternating projection method. Nothing more, nothing less.
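To make this concrete, here is a sketch of the alternating scheme for this particular problem. This is my own implementation, not the paper's code: projecting onto $S_1 = \{X : X\mathbb{1} = \mathbb{1},\ X \ge 0\}$ decouples into projecting each row onto the probability simplex (the standard sort-based routine), and projecting onto $S_2 = \{X : X = X^T,\ Tr(X) = k\}$ has the closed form $\tfrac12(Y + Y^T) + \frac{k - Tr(Y)}{n} I$, since $S_2$ is an affine set.

```python
import numpy as np

def project_simplex(v):
    # Euclidean projection of v onto {x : x >= 0, sum(x) = 1}
    # (standard sort-based algorithm).
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = np.max(np.nonzero(u - (css - 1.0) / idx > 0)[0])
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def P_S1(Y):
    # Projection onto S1: rows are independent, so project each row
    # onto the probability simplex.
    return np.vstack([project_simplex(row) for row in Y])

def P_S2(Y, k):
    # Projection onto S2: symmetrize, then shift the diagonal so the
    # trace equals k (closed form for this affine set).
    S = 0.5 * (Y + Y.T)
    n = Y.shape[0]
    return S + (k - np.trace(S)) / n * np.eye(n)

def alternating_projections(A, k, iters=2000, tol=1e-12):
    X = A.copy()
    for _ in range(iters):
        X_new = P_S2(P_S1(X), k)
        if np.linalg.norm(X_new - X) < tol:
            return X_new
        X = X_new
    return X

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
X = alternating_projections(A, k=2.0)
print(np.trace(X), np.abs(X - X.T).max())
```

Because the returned iterate comes just after the $S_2$ step, it is exactly symmetric with trace $k$; the row-sum and nonnegativity constraints of $S_1$ are satisfied only up to a residual that shrinks as the iteration converges.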