What's the Advantage of Difference of Convex Programming Compared to the Gradient Projection Method?


Consider the following problem $(P_1)$:

$$(P_1)\quad \begin{array}{cl} \min\limits_{\mathbf{x}\in\mathbb{R}^n} & f_1(\mathbf{x})-f_2(\mathbf{x})\\ \text{s.t.} & \mathbf{A}\mathbf{x}=\mathbf{b},\\ & \mathbf{0}\le \mathbf{x}\le \mathbf{1}, \end{array}$$ where $f_1(\mathbf{x})$ and $f_2(\mathbf{x})$ are two continuously differentiable convex functions.

According to Thomas Lipp and Stephen Boyd, Variations and Extensions of the Convex-Concave Procedure, $(P_1)$ can be solved by DC (difference-of-convex) programming to obtain a stationary point. However, $(P_1)$ can also be solved by the gradient projection method (GPM).
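
For concreteness, here is a minimal sketch of the convex-concave procedure applied to $(P_1)$, assuming quadratic choices of $f_1$ and $f_2$ and randomly generated data (the specific $f_1$, $f_2$, $\mathbf{A}$, $\mathbf{b}$ below are my own illustrative choices, not from the paper): at each iteration, $f_2$ is replaced by its linearization at the current point and the resulting convex problem is solved.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, m = 10, 3
M1 = rng.standard_normal((n, n))      # f1(x) = ||M1.T @ x||^2, convex
M2 = rng.standard_normal((n, n))
P2 = M2 @ M2.T                        # f2(x) = x.T @ P2 @ x, convex
A = rng.standard_normal((m, n))
x_k = rng.uniform(0.2, 0.8, n)        # feasible starting point
b = A @ x_k                           # choose b so that A @ x_k = b

for _ in range(50):
    x = cp.Variable(n)
    grad_f2 = 2 * P2 @ x_k            # gradient of f2 at x_k
    # Convexified objective: f1(x) minus the linearization of f2 at x_k
    obj = cp.Minimize(cp.sum_squares(M1.T @ x) - grad_f2 @ (x - x_k))
    cp.Problem(obj, [A @ x == b, x >= 0, x <= 1]).solve()
    if np.linalg.norm(x.value - x_k) <= 1e-6:
        break
    x_k = x.value
print(x_k)  # a stationary point of f1 - f2 on the feasible set
```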

My question is: when solving $(P_1)$, is DC programming more efficient than GPM? What are the advantages of DC programming compared with GPM?

Accepted answer:

The convex-concave procedure (CCP) can be applied to a DC programming problem even in cases where the convex functions $f_1$ and $f_2$ are non-smooth.

Gradient descent (and hence GPM) can't be applied to such problems, because $f_{1}(x)-f_{2}(x)$ won't generally be smooth when $f_1$ and $f_2$ are non-smooth.
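
To make this concrete, here is a toy sketch (my own example, not part of the original answer) with $f_1(x) = x^2$ and $f_2(x) = |x|$: the DC objective $x^2 - |x|$ has a kink at $x = 0$, so a gradient of the objective is undefined there, while CCP only needs a subgradient of $f_2$ and each convex subproblem is smooth.

```python
import numpy as np

# DC objective f(x) = f1(x) - f2(x) with f1(x) = x**2 (smooth, convex)
# and f2(x) = |x| (convex but non-smooth at 0).
def ccp_step(x_k):
    g = np.sign(x_k) if x_k != 0 else 1.0  # subgradient of |x|; any value in [-1, 1] is valid at 0
    # Convex subproblem: minimize x**2 - g*x; its closed-form minimizer is g/2.
    return g / 2.0

x_k = 0.3
for _ in range(10):
    x_k = ccp_step(x_k)
print(x_k)  # 0.5, a local minimizer of x**2 - |x|
```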

Another answer:

I know it may be too late to answer this question.

In my humble opinion, the gradient projection method (GPM) can be viewed, to some extent, as a special case of DC programming.

Let us consider the following optimization problem: \begin{equation} \begin{array}{cl} {\min} & {f(x)} \\ {\text{s.t.}} & {x \in C} \end{array} \end{equation} where $f$ is not necessarily convex and $C$ is a closed convex set. Clearly, this class of problems includes the OP's problem $(P_1)$.

To solve this problem with DC programming, we construct two convex functions, \begin{equation} g(x) = \frac{\lambda}{2} \|x\|^2, \quad h(x) = \frac{\lambda}{2} \|x\|^2 - f(x), \end{equation} where $\lambda$ is chosen large enough that $h$ is convex, i.e., $\lambda I \succeq \nabla^2 f(x)$ for all $x$ (assuming $f$ is twice differentiable).

Through DC programming (at iteration $k$, keep $g$ and linearize $h$ at $x^k$), we obtain \begin{equation} \begin{aligned} \bar{x}^{k} &= \arg\min_{x} \frac{\lambda}{2} \|x\|^2 - (x-x^k)^T \nabla h(x^k) \\ &= \frac{1}{\lambda}\nabla h(x^k) = x^k - \frac{1}{\lambda} \nabla f(x^k), \\ x^{k+1} &= \operatorname{Proj}_{C} (\bar{x}^{k}), \end{aligned} \end{equation} where the projection appears because the subproblem objective equals $\frac{\lambda}{2}\|x-\bar{x}^k\|^2$ up to a constant, so minimizing it over $C$ amounts to projecting $\bar{x}^k$ onto $C$. This is exactly GPM with step size $1/\lambda$. This DC decomposition is very common; one can refer to DC Programming and DCA - Theory, Algorithms and Applications.
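
The equivalence is easy to verify numerically. Below is a small check under my own simplifications (not from the reference): $C$ is taken to be the box $[0,1]^n$, dropping the affine constraint $Ax = b$ so the projection has a closed form, and $f$ is a convex quadratic so that any $\lambda \ge \lambda_{\max}(\nabla^2 f)$ makes $h$ convex.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
M = rng.standard_normal((n, n))
Q = M @ M.T                                # f(x) = 0.5 * x.T @ Q @ x, Hessian Q
grad_f = lambda x: Q @ x
lam = np.linalg.eigvalsh(Q).max() + 1.0    # lam * I >= Q, so h is convex
proj_C = lambda x: np.clip(x, 0.0, 1.0)    # projection onto C = [0, 1]^n

x_k = rng.uniform(0.0, 1.0, n)

# DCA step: minimize (lam/2)*||x||^2 - x.T @ grad_h(x_k) over C, with
# grad_h(x_k) = lam * x_k - grad_f(x_k); the minimizer is Proj_C(grad_h(x_k)/lam).
dca_next = proj_C((lam * x_k - grad_f(x_k)) / lam)

# Projected-gradient (GPM) step with fixed step size 1/lam.
gpm_next = proj_C(x_k - grad_f(x_k) / lam)

print(np.allclose(dca_next, gpm_next))     # True: the two steps coincide
```

With a general closed convex $C$, only the projection operator changes; the algebra above is unaffected.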