Orthogonal Projection onto the Intersection of Convex Sets


The general method of Projections onto Convex Sets (POCS) can be used to find a point in the intersection of a number of convex sets, i.e., to solve

$$ \text{find } x \in \mathbb{R}^N \text{ s.t. } x \in \bigcap_i C_i $$

This method finds an arbitrary feasible point in the intersection of the convex sets. My question is: is there a similar method that instead finds the point $x^*$ of minimum norm, i.e., that solves

$$ x^* = \arg \min_{x \in \mathbb{R}^N} \Vert x\Vert \text{ s.t. } x \in \bigcap_i C_i$$

or, more generally, finds the projection of a given point $a \in \mathbb{R}^N$ onto the intersection of the convex sets, i.e., solves

$$ x^* = \arg \min_{x \in \mathbb{R}^N} \Vert x - a\Vert \text{ s.t. } x \in \bigcap_i C_i$$ for some $a \in \mathbb{R}^N$?

Edit

The problem we are trying to solve is a 3D tomography reconstruction problem, so the variable $x$ takes gigabytes of RAM. We already have a POCS algorithm (a variant of the Algebraic Reconstruction Technique (ART)) that finds a feasible point. Is there a way to use it as a black box, or to adapt it, to find a minimum-norm solution instead?

There are 3 answers below.


This can be done fairly easily using proximal algorithms. Let $\delta_i$ be the indicator function of $C_i$: \begin{equation} \delta_i(x) = \begin{cases} 0 & \text{if } x \in C_i, \\ \infty & \text{otherwise.} \end{cases} \end{equation} Your optimization problem can be written as \begin{equation} \text{minimize} \quad \| x - a \| + \sum_i \delta_i(x). \end{equation} The objective function is a sum of functions that have easy proximal operators. Thus, you can use the Douglas-Rachford method (together with the consensus trick) to solve this optimization problem.

The Douglas-Rachford method is an iterative method for minimizing the sum of two convex functions, each of which has an easy proximal operator. Since we have a sum of more than two functions, it may seem that Douglas-Rachford does not apply. However, we can get around this by using the consensus trick. We reformulate our problem as

\begin{align*} \text{minimize} & \quad \underbrace{\|x_0 - a \| + \sum_i \delta_i(x_i)}_{f(x_0,x_1,\ldots,x_m)} + \underbrace{\delta_S(x_0,x_1,\ldots,x_m)}_{g(x_0,x_1,\ldots,x_m)} \end{align*} where \begin{equation} S = \{(x_0,x_1,\ldots,x_m) \mid x_0 = x_1 = \cdots = x_m \} \end{equation} and $\delta_S$ is the indicator function of $S$. The variables in this reformulated problem are the vectors $x_0,x_1,\ldots,x_m$. The indicator function $\delta_S$ is being used to enforce the constraint that all these vectors should be equal. We are now minimizing a sum of two convex functions, $f$ and $g$, which have easy prox-operators because $f$ is a separable sum of easy functions and $g$ is the indicator function of a set that we can project onto easily. So we are now able to apply the Douglas-Rachford method.

The Douglas-Rachford iteration is just three lines, and the code for this problem could probably be written in about a page of Matlab (unless projecting onto the sets $C_i$ is very complicated).
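As a concrete illustration, here is a minimal Python sketch of the consensus Douglas-Rachford scheme described above (the structure would be the same in Matlab). The two sets, the point $a$, the parameter $\rho$, and the iteration count are illustrative assumptions. Since minimizing $\|x - a\|$ over a set yields the same point as minimizing $\frac{1}{2}\|x - a\|^2$, the sketch uses the quadratic, whose prox has a simple closed form.

```python
import numpy as np

def proj_ball(v, r=1.0):
    # projection onto the Euclidean ball {x : ||x||_2 <= r}
    nrm = np.linalg.norm(v)
    return v if nrm <= r else v * (r / nrm)

def proj_shifted_orthant(v, lo=0.2):
    # projection onto {x : x >= lo} (componentwise)
    return np.maximum(v, lo)

def dr_consensus(a, projs, rho=1.0, iters=2000):
    # Douglas-Rachford on the product space: block 0 carries the distance term,
    # blocks 1..m carry the set indicators; prox of the consensus set = averaging.
    m = len(projs)
    z = np.tile(a, (m + 1, 1))
    for _ in range(iters):
        xbar = z.mean(axis=0)                     # prox of delta_S: average the blocks
        x = np.tile(xbar, (m + 1, 1))
        refl = 2.0 * x - z                        # reflection step
        y = np.empty_like(z)
        y[0] = (rho * refl[0] + a) / (1.0 + rho)  # prox of (1/rho)(1/2)||. - a||^2
        for i, P in enumerate(projs, start=1):
            y[i] = P(refl[i])                     # prox of an indicator = projection
        z += y - x                                # Douglas-Rachford update
    return z.mean(axis=0)

a = np.array([2.0, 2.0])
x_star = dr_consensus(a, [proj_ball, proj_shifted_orthant])
# x_star approaches the projection of a onto the intersection,
# here (1/sqrt(2), 1/sqrt(2))
```

The only problem-specific pieces are the projections onto the sets $C_i$ and the prox of the distance term, which matches the claim that the rest of the method is a few lines.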


I found an algorithm called the Hybrid Steepest Descent Method (explained in detail in the paper "The Hybrid Steepest Descent Method for the Variational Inequality Problem over the Intersection of Fixed Point Sets of Nonexpansive Mappings") that can solve problems of the form

$$x^* = \arg \min_x f(x) \text{ s.t. } x \in \bigcap_i C_i$$

by iterating

$$ \begin{align} y^t & = (1-\beta)x^{t}+\beta\sum_{i}w_{i}P_{C_{i}}(x^{t}) \\ x^{t+1} & = y^{t}-\lambda_{t+1}\mu \nabla f(y^t) \end{align} $$

where $\mu>0$ is small enough, $\lambda_t = 1/t$, $0<\beta\le 2$, $P_{C_i}$ is the projection onto the set $C_i$, the weights satisfy $w_i \in (0,1]$ and $\sum_i w_i = 1$, and $\nabla f$ is the gradient of $f$.

So to solve the problem

$$x^* = \arg \min_x \Vert x-a \Vert^2 \text{ s.t. } x \in \bigcap_i C_i $$ we simply substitute $$f(x)=\frac{1}{2} \Vert x-a\Vert^2, \qquad \nabla f(x)=x-a.$$
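A minimal Python sketch of this iteration, assuming for illustration two sets with trivial projections (the nonnegative orthant and a componentwise upper bound) and the parameter choices $\beta = 1$, $\mu = 0.5$, $w_i = 1/2$; none of these choices are prescribed by the method itself.

```python
import numpy as np

def proj_nonneg(v):
    # projection onto C1 = {x : x >= 0}
    return np.maximum(v, 0.0)

def proj_upper(v, ub=2.0):
    # projection onto C2 = {x : x <= ub} (componentwise)
    return np.minimum(v, ub)

def hybrid_steepest_descent(a, projs, weights, beta=1.0, mu=0.5, iters=2000):
    # f(x) = (1/2)||x - a||^2, so grad f(y) = y - a
    x = a.copy()
    for t in range(1, iters + 1):
        # weighted averaged-projection step
        y = (1.0 - beta) * x + beta * sum(w * P(x) for w, P in zip(weights, projs))
        # diminishing gradient step with lambda_t = 1/t
        x = y - (1.0 / t) * mu * (y - a)
    return x

a = np.array([3.0, -1.0])
x_star = hybrid_steepest_descent(a, [proj_nonneg, proj_upper], [0.5, 0.5])
# x_star approaches the projection of a onto [0, 2]^2, i.e. roughly (2, 0)
```

Note that only projections onto the individual sets $C_i$ are needed, so an existing POCS/ART projection routine can be reused directly inside the averaged-projection step.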


ADMM Consensus Trick and Orthogonal Projection onto an Intersection of Convex Sets

The ADMM Consensus Optimization

Given a problem in the form:

$$\begin{aligned} \arg \min_{ \boldsymbol{x} } \quad & \sum_{i = 1}^{n} {f}_{i} \left( \boldsymbol{x} \right) \end{aligned}$$

By introducing auxiliary variables, one can rewrite it as:

$$\begin{aligned} \arg \min_{ \boldsymbol{x}_{1}, \ldots, \boldsymbol{x}_{n}, \boldsymbol{z} } \quad & \sum_{i = 1}^{n} {f}_{i} \left( \boldsymbol{x}_{i} \right) \\ \text{subject to} \quad & \boldsymbol{x}_{i} = \boldsymbol{z}, \; i = 1, 2, \ldots, n \\ \end{aligned}$$

Namely, it is a separable form with an equality constraint on each variable.
In matrix form it can be written as:

$$\begin{aligned} \arg \min_{ \boldsymbol{u}, \boldsymbol{z} } \quad & \sum_{i = 1}^{n} {f}_{i} \left( \boldsymbol{x}_{i} \right) \\ \text{subject to} \quad & \boldsymbol{u} := \begin{bmatrix} \boldsymbol{x}_{1} \\ \boldsymbol{x}_{2} \\ \vdots \\ \boldsymbol{x}_{n} \end{bmatrix} = \begin{bmatrix} I \\ I \\ \vdots \\ I \end{bmatrix} \boldsymbol{z} \end{aligned}$$

By defining $ {E}_{i} $ to be the matrix such that $ \boldsymbol{x}_{i} = {E}_{i} \boldsymbol{u} $, namely the selector of the appropriate subvector from $ \boldsymbol{u} $, we can write the problem in the ADMM form (the scaled ADMM form):

$$\begin{aligned} \boldsymbol{u}^{k + 1} & = \arg \min_{ \boldsymbol{u} } \sum_{i = 1}^{n} {f}_{i} \left( {E}_{i} \boldsymbol{u} \right) + \frac{\rho}{2} {\left\| \boldsymbol{u} - \begin{bmatrix} I \\ I \\ \vdots \\ I \end{bmatrix} \boldsymbol{z}^{k} + \boldsymbol{\lambda}^{k} \right\|}_{2}^{2} \\ \boldsymbol{z}^{k + 1} & = \arg \min_{ \boldsymbol{z} } \frac{\rho}{2} {\left\| \boldsymbol{u}^{k + 1} - \begin{bmatrix} I \\ I \\ \vdots \\ I \end{bmatrix} \boldsymbol{z} + \boldsymbol{\lambda}^{k} \right\|}_{2}^{2} \\ \boldsymbol{\lambda}^{k + 1} & = \boldsymbol{\lambda}^{k} + \boldsymbol{u}^{k + 1} - \begin{bmatrix} I \\ I \\ \vdots \\ I \end{bmatrix} \boldsymbol{z}^{k + 1} \\ \end{aligned}$$

Since the form is block separable, it can be written elementwise:

$$\begin{aligned} \boldsymbol{x}_{i}^{k + 1} & = \arg \min_{ \boldsymbol{x}_{i} } {f}_{i} \left( \boldsymbol{x}_{i} \right) + \frac{\rho}{2} {\left\| \boldsymbol{x}_{i} - \boldsymbol{z}^{k} + \boldsymbol{\lambda}_{i}^{k} \right\|}_{2}^{2}, & i = 1, 2, \ldots, n \\ \boldsymbol{z}^{k + 1} & = \arg \min_{ \boldsymbol{z} } \frac{\rho}{2} \sum_{i = 1}^{n} {\left\| \boldsymbol{x}_{i}^{k + 1} - \boldsymbol{z} + \boldsymbol{\lambda}_{i}^{k} \right\|}_{2}^{2} \\ \boldsymbol{\lambda}_{i}^{k + 1} & = \boldsymbol{\lambda}_{i}^{k} + \boldsymbol{x}_{i}^{k + 1} - \boldsymbol{z}^{k + 1}, & i = 1, 2, \ldots, n \\ \end{aligned}$$

Remark: Pay attention that $ \boldsymbol{z}^{k + 1} = \frac{1}{n} \sum_{i = 1}^{n} \left( \boldsymbol{x}_{i}^{k + 1} + \boldsymbol{\lambda}_{i}^{k} \right) $, namely the mean value of the set $ { \left\{ \boldsymbol{x}_{i}^{k + 1} + \boldsymbol{\lambda}_{i}^{k} \right\} }_{i = 1}^{n} $.

The Proximal Operator is given by $ \operatorname{Prox}_{\lambda f \left( \cdot \right)} \left( \boldsymbol{y} \right) = \arg \min_{\boldsymbol{x}} \frac{1}{2} {\left\| \boldsymbol{x} - \boldsymbol{y} \right\|}_{2}^{2} + \lambda f \left( \boldsymbol{x} \right) $. Using it, and simplifying the optimization for $ \boldsymbol{z}^{k + 1} $, one can write the above as:

$$\begin{aligned} \boldsymbol{x}_{i}^{k + 1} & = \operatorname{Prox}_{\frac{1}{\rho} {f}_{i} \left( \cdot \right)} \left( \boldsymbol{z}^{k} - \boldsymbol{\lambda}_{i}^{k} \right), & i = 1, 2, \ldots, n \\ \boldsymbol{z}^{k + 1} & = \frac{1}{n} \sum_{i = 1}^{n} \left( \boldsymbol{x}_{i}^{k + 1} + \boldsymbol{\lambda}_{i}^{k} \right) \\ \boldsymbol{\lambda}_{i}^{k + 1} & = \boldsymbol{\lambda}_{i}^{k} + \boldsymbol{x}_{i}^{k + 1} - \boldsymbol{z}^{k + 1}, & i = 1, 2, \ldots, n \\ \end{aligned}$$

Orthogonal Projection onto an Intersection of Convex Sets

The problem is given by:

$$\begin{aligned} \arg \min_{ \boldsymbol{x} } \quad & \frac{1}{2} {\left\| \boldsymbol{x} - \boldsymbol{y} \right\|}_{2}^{2} \\ \text{subject to} \quad & \boldsymbol{x} \in \bigcap_{i = 1}^{n} \mathcal{C}_{i} \\ \end{aligned}$$

Namely, we are looking for the orthogonal projection of $ \boldsymbol{y} $ onto the intersection of the sets $ {\left\{ \mathcal{C}_{i} \right\}}_{i = 1}^{n} $.

The problem could be rewritten as:

$$\begin{aligned} \arg \min_{ \boldsymbol{x} } \quad & \frac{1}{2} {\left\| \boldsymbol{x} - \boldsymbol{y} \right\|}_{2}^{2} + \sum_{i = 1}^{n} {\delta}_{\mathcal{C}_{i}} \left( \boldsymbol{x} \right) \end{aligned}$$

Where $ {\delta}_{\mathcal{C}_{i}} (x) = \begin{cases} 0 & x \in \mathcal{C}_{i} \\ \infty & x \notin \mathcal{C}_{i} \end{cases} $.

So we can use the ADMM Consensus Optimization by setting $ {f}_{1} \left( \boldsymbol{x} \right) = \frac{1}{2} {\left\| \boldsymbol{x} - \boldsymbol{y} \right\|}_{2}^{2} $ and $ {f}_{i} \left( \boldsymbol{x} \right) = {\delta}_{\mathcal{C}_{i - 1}} \left( \boldsymbol{x} \right), \; i = 2, 3, \ldots, n + 1 $.

In order to solve it we need the Proximal Operator of each function. For $ {f}_{1} \left( \cdot \right) $ it is given by:

$$ \operatorname{Prox}_{ \frac{1}{\rho} {f}_{1} \left( \cdot \right)} \left( \boldsymbol{v} \right) = \arg \min_{ \boldsymbol{x} } \frac{1}{2} {\left\| \boldsymbol{x} - \boldsymbol{y} \right\|}_{2}^{2} + \frac{\rho}{2} {\left\| \boldsymbol{x} - \boldsymbol{v} \right\|}_{2}^{2} = \frac{\rho \boldsymbol{v} + \boldsymbol{y}}{1 + \rho} $$

For the rest, the Proximal Operator is the orthogonal projection:

$$ \operatorname{Prox}_{ \frac{1}{\rho} {f}_{i} \left( \cdot \right)} \left( \boldsymbol{v} \right) = \operatorname{Proj}_{ \mathcal{C}_{i - 1} } \left( \boldsymbol{v} \right), \; i = 2, 3, \ldots, n + 1 $$
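Putting the pieces together, here is a minimal Python sketch of the consensus ADMM scheme above for the projection problem. The two sets (the nonnegative orthant and a Euclidean ball), the point $ \boldsymbol{y} $, the parameter $ \rho $, and the iteration count are illustrative assumptions.

```python
import numpy as np

def proj_nonneg(v):
    # projection onto {x : x >= 0}
    return np.maximum(v, 0.0)

def proj_ball(v, r=2.0):
    # projection onto {x : ||x||_2 <= r}
    nrm = np.linalg.norm(v)
    return v if nrm <= r else v * (r / nrm)

def admm_projection(y, projs, rho=1.0, iters=1000):
    # Block 0 carries f1 = (1/2)||x - y||^2; blocks 1..n carry the set indicators.
    n = len(projs) + 1
    z = y.copy()
    lam = np.zeros((n, y.size))
    x = np.zeros((n, y.size))
    for _ in range(iters):
        v = z - lam                              # prox arguments z^k - lambda_i^k
        x[0] = (rho * v[0] + y) / (1.0 + rho)    # closed-form prox of (1/rho) f1
        for i, P in enumerate(projs, start=1):
            x[i] = P(v[i])                       # prox of an indicator = projection
        z = (x + lam).mean(axis=0)               # z-update: mean of x_i + lambda_i
        lam += x - z                             # scaled dual update
    return z

y = np.array([-1.0, 3.0])
x_star = admm_projection(y, [proj_nonneg, proj_ball])
# x_star approaches the projection of y onto {x >= 0} ∩ {||x|| <= 2}
```

As in the other answers, each set enters only through its projection operator, so the per-iteration cost is one closed-form prox plus one projection per set.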