Non-smooth convex optimisation


I am in need of a suitable algorithm for optimisation problems of the form $$\textrm{Minimise} \;\;\; f(x) + g(x) \;\;\;\textrm{subject to} \;\;\; Ax = 0.$$ The function $g$ is non-smooth, but it is convex with Legendre dual $g^*$ being an indicator function for a convex set. The function $f$ is convex and differentiable, but nothing beyond that (the dual $f^*$ is not nice).

The Lagrangian for the system is $$\mathcal{L} = f(x) + g(x) + z^{T}Ax = \sup_{y} \{ f(x) -g^*(y) + x^{T} (y + A^{T}z) \}\qquad \qquad (1)$$ where $z$ is the Lagrange multiplier for the $Ax=0$ constraint, and we have used the expression relating $g$ to its dual $g^*$, for which we introduce the dual variable $y$. Due to the properties of the functions involved, the final expression for $\mathcal{L}$ in equation (1) is really the nicest/most useful way of expressing the problem. I am looking for a convergent algorithm to obtain the saddle point of this Lagrangian.
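For concreteness, here is a quick numerical check of the duality identity used above, on an assumed example (not from the problem statement): take $g(x) = \lambda\|x\|_1$, whose dual $g^*$ is the indicator of the box $\{y : \|y\|_\infty \le \lambda\}$, so that $g(x) = \sup_y \{x^T y - g^*(y)\}$ with the supremum attained at $y = \lambda\,\mathrm{sign}(x)$:

```python
import numpy as np

# Hypothetical example: g(x) = lam*||x||_1, so g* is the indicator of
# the box C = {y : ||y||_inf <= lam}.  The supremum in
# g(x) = sup_y { x^T y - g*(y) } is attained at y = lam*sign(x) in C.
lam = 0.5
x = np.array([2., -1., 0., 3.])

g_direct = lam * np.sum(np.abs(x))   # g(x) evaluated directly
y_star = lam * np.sign(x)            # maximiser over the box C
g_via_dual = x @ y_star              # sup of x^T y over ||y||_inf <= lam

assert np.isclose(g_direct, g_via_dual)
```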

I have thought about adapting approaches from related problems:

  1. If the dual of $f$ were tractable, then we could write $$\mathcal{L} = \sup_{y_1,y_2} \{ -f^*(y_1) -g^*(y_2) + x^{T} (y_1 + y_2 + A^{T}z) \}.$$ In this case, a good approach would be an Augmented Lagrangian method: the original state $x$ acts like a Lagrange multiplier for the constraint $y_1 + y_2 + A^{T}z=0$ in the above formulation. However, returning to equation (1), the state $x$ does not appear as a Lagrange multiplier, so this approach is not appropriate (taking the derivative of the Lagrangian w.r.t. $x$ shows it is not a good idea to penalise the term $y + A^{T}z$, since it should not be zero at the saddle point).
  2. If the constraint $Ax=0$ were not there, then we could simply apply the Chambolle--Pock 2011 algorithm (https://hal.archives-ouvertes.fr/hal-00490826/document). I was thinking that this could be extended with a gradient ascent step for $z$ after every round of the Chambolle--Pock update (I am still trying this out), but I don't know whether this approach is convergent.
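For what it is worth, the idea in point 2 has a principled cousin: the Vũ--Condat primal--dual scheme handles a smooth $f$ by a gradient step, obtains $\mathrm{prox}_g$ from the projection onto the set $C$ underlying $g^*$ via Moreau's identity, and updates $z$ by (extrapolated) ascent. A sketch on a toy instance where $f$, $g$, $A$, $b$, $\lambda$ are my own assumptions ($f(x)=\tfrac12\|x-b\|^2$, $g=\lambda\|\cdot\|_1$):

```python
import numpy as np

# Vu-Condat iteration for  min f(x) + g(x)  s.t.  Ax = 0,
# with f handled by its gradient and prox_g recovered from the
# projection onto C (the set on which g* is the indicator) via Moreau:
#   prox_{t g}(v) = v - t * proj_C(v / t).
# All problem data below are assumptions for illustration only:
#   f(x) = 0.5*||x - b||^2,  g = lam*||.||_1,  C = {y : ||y||_inf <= lam}.
A = np.array([[1., 1., 1., 1.]])
b = np.array([2., -1., 0.5, 3.])
lam = 0.5

grad_f = lambda x: x - b                     # Lipschitz constant L_f = 1
proj_C = lambda y: np.clip(y, -lam, lam)     # projection onto C
prox_g = lambda v, t: v - t * proj_C(v / t)  # Moreau identity

# step-size condition: 1/tau - sigma*||A||^2 >= L_f/2  (here ||A||^2 = 4)
tau, sigma = 1.0, 0.1

x = np.zeros(4)
z = np.zeros(1)
for _ in range(20000):
    x_new = prox_g(x - tau * (grad_f(x) + A.T @ z), tau)  # primal step
    z = z + sigma * (A @ (2 * x_new - x))  # dual ascent with extrapolation
    x = x_new
```

The dual update on $z$ plays exactly the role of the "gradient ascent step" described above, but with the $2x^{k+1}-x^k$ extrapolation that the convergence theory requires.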

I am unsure whether there is a more appropriate method for this problem -- I haven't found any literature on optimisation problems with similar structure to the one above.

Cheers for any comments or help.

EDIT: I have tried introducing an additional dummy variable $w$ with dual $q$ and writing the Lagrangian instead as $$\mathcal{L} = f(x) + g(x) + z^{T}Ax = \sup_{y,q} \inf_w \{ f(w) -g^*(y) + x^{T} (y + A^{T}z) + q^{T}(x-w) \},$$ in which the original state variable now acts as a Lagrange multiplier on the constraint $y + A^{T}z + q = 0$. I have proceeded with an Augmented Lagrangian approach (ALG2 from Michel Fortin and Roland Glowinski, Augmented Lagrangian Methods) with $$\mathcal{L}_r = f(w) - g^*(y) + x^{T} (y + A^{T}z + q) - w^{T} q +\frac{r}{2}\|y + A^{T}z + q\|_2^2;$$ however, this does not appear to converge.


There are 2 solutions below.


I assume you're familiar with proximal methods like those of Chambolle--Pock, Vũ--Condat, and Combettes et al. You should check out an extension called "three-operator splitting" (https://arxiv.org/pdf/1504.01032.pdf) by Davis and Yin. Your problem is an instance of problems (2.4), (2.5), and (2.11) there, for which explicit algorithms are presented and analysed.
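To illustrate, here is a sketch of Davis--Yin splitting for $\min_x g_1(x) + g_2(x) + h(x)$ with $h$ smooth, taking $g_1$ as the indicator of $\{Ax=0\}$ (prox $=$ projection), $g_2$ the nonsmooth term, and $h = f$ via its gradient. The problem data below ($f$, $g$, $A$, $b$, $\lambda$) are my own assumptions, not the OP's:

```python
import numpy as np

def davis_yin(prox_g1, prox_g2, grad_h, z0, gamma, iters=5000):
    """Davis-Yin splitting for min g1(x) + g2(x) + h(x), h smooth;
    requires gamma < 2/L, L the Lipschitz constant of grad h."""
    z = z0.copy()
    for _ in range(iters):
        xg = prox_g1(z, gamma)                                # backward step on g1
        xf = prox_g2(2 * xg - z - gamma * grad_h(xg), gamma)  # backward on g2, forward on h
        z = z + xf - xg                                       # fixed-point update
    return prox_g1(z, gamma)

# Assumed toy instance:  min 0.5*||x - b||^2 + lam*||x||_1  s.t.  Ax = 0
A = np.array([[1., 1., 1., 1.]])
b = np.array([2., -1., 0.5, 3.])
lam = 0.5
P = np.eye(4) - A.T @ np.linalg.solve(A @ A.T, A)  # projector onto null(A)

proj = lambda z, t: P @ z                          # prox of indicator{Ax = 0}
soft = lambda z, t: np.sign(z) * np.maximum(np.abs(z) - t * lam, 0.)  # prox of lam*||.||_1
grad = lambda x: x - b                             # gradient of the smooth part

x = davis_yin(proj, soft, grad, np.zeros(4), gamma=1.0)
```

Since the returned point is the output of the projection prox, it satisfies $Ax=0$ exactly at every iteration.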

Also, as Royi said in the comments, providing more information about the functions involved in your problem can only help to simplify things further.


I see three ways for you to proceed. There is generalized forward-backward splitting (GFB), where you lift to a product space and then do a more or less standard forward-backward iteration. Then there is forward Douglas--Rachford splitting, which, if I am not mistaken, is very similar to GFB here. Lastly, there is three-operator splitting, as already suggested.
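A sketch of the first option, following the Raguet--Fadili--Peyré scheme: GFB keeps one auxiliary variable per nonsmooth term and averages them, with the smooth $f$ handled by a gradient step. The problem data below are my own assumptions for illustration:

```python
import numpy as np

# Generalized forward-backward for  min f(x) + g1(x) + g2(x),  f smooth,
# on an assumed toy instance: f(x) = 0.5*||x - b||^2, g1 = lam*||.||_1,
# g2 = indicator of {Ax = 0}.  One auxiliary variable z_i per nonsmooth
# term, equal weights w_i = 1/2, step gamma < 2/L_f.
A = np.array([[1., 1., 1., 1.]])
b = np.array([2., -1., 0.5, 3.])
lam = 0.5
P = np.eye(4) - A.T @ np.linalg.solve(A @ A.T, A)  # projector onto null(A)

grad_f = lambda x: x - b                           # L_f = 1
soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t * lam, 0.)
proxes = [lambda v, t: soft(v, t),                 # prox of t*g1
          lambda v, t: P @ v]                      # prox of t*g2 (projection)
w = [0.5, 0.5]
gamma = 1.0

x = np.zeros(4)
zs = [np.zeros(4), np.zeros(4)]
for _ in range(20000):
    gx = grad_f(x)
    for i in range(2):
        # prox of g_i at scale gamma / w_i, relaxation lambda_k = 1
        zs[i] = zs[i] + proxes[i](2 * x - zs[i] - gamma * gx, gamma / w[i]) - x
    x = w[0] * zs[0] + w[1] * zs[1]                # average over the product space
```

With a single nonsmooth term ($n=1$, $w_1=1$) this reduces to ordinary forward-backward splitting, which is a useful sanity check on the scheme.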