Why reformulate composite optimization using equality constraints?


In the “Practical Optimization” section of MOSEK’s documentation, it is recommended to reformulate composite functions $f(g(x))$, where $f$ is convex, by introducing a new variable and moving the inner $g(x)$ into a new equality constraint.

Moreover, in this work on composite function minimization, the authors convert an unconstrained composite optimization problem into one with equality constraints using the same reformulation as above.

Finally, we often consider problems with equality constraints \begin{equation} \min_{x, r} f(r) \quad \text{s.t.}\quad r=g(x) \end{equation} even when the equality constraint could be “folded into” the objective function.
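For concreteness, one illustrative instance of this pattern (my own example, not taken from the sources above) is nonlinear least squares, where $f(r)=\|r\|^2$ and $g$ is the residual map: \begin{equation} \min_{x} \|g(x)\|^2 \quad\Longleftrightarrow\quad \min_{x,\,r} \|r\|^2 \quad \text{s.t.}\quad r=g(x). \end{equation} The constraint could clearly be folded back into the objective, yet the lifted form separates the simple convex $f$ from the possibly nonlinear $g$.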

Is there a reason why these types of reformulations are done for composite functions? Is there any work that analyzes the benefits of doing this reformulation? Does the answer change when $g$ has additional structure (e.g., convexity/concavity)?

1 Answer
In the first case, you don't move it into equalities (unless $g$ is an affine function) but into inequalities.

The reason is that you must write everything using standard conic operators. An objective $e^{x^2}$ cannot be passed directly to the solver, but by replacing the objective with $t$ and introducing the constraints $t\geq e^s$ and $s\geq x^2$, you obtain a model built from standard conic-representable constraints.
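As a quick numerical sanity check (a sketch of mine, not MOSEK API code) that the epigraph reformulation is equivalent: for each fixed $x$ the constraints are tightest at $s = x^2$ and $t = e^s$, so the smallest feasible $t$ reproduces $e^{x^2}$.

```python
import numpy as np

# Original objective e^{x^2} on a grid of x values.
xs = np.linspace(-2.0, 2.0, 401)
direct = np.exp(xs ** 2)

# Epigraph form: minimize t subject to t >= e^s and s >= x^2.
# For fixed x, the minimal feasible choices are s = x^2 and t = e^s.
s = xs ** 2
t = np.exp(s)

# The minimal feasible t matches the original objective pointwise,
# and both attain their minimum value 1 at x = 0.
assert np.allclose(t, direct)
assert np.isclose(t.min(), 1.0)
```

In an actual conic model the two inequalities would be expressed via the exponential cone and a rotated quadratic cone, respectively; the check above only verifies that the lifted problem has the same optimal value.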

In the second case, the reformulation is not always necessary, but doing it may let you develop an algorithm and say more about its behavior, depending on various properties of the functions.

A computational reason might be that the decomposition allows you to simplify gradients and Hessians, essentially trading more variables for sparsity and structure. For instance, $\left(\sum_{i=1}^N x_i\right)^2$ has a dense $N\times N$ Hessian, but by introducing a single extra variable $r=\sum_{i=1}^N x_i$, the objective's Hessian becomes trivially sparse, with a single nonzero element.
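To make that last point concrete, here is a small sketch of mine (the function and variable names are illustrative) that compares the Hessian of the original objective with the Hessian of the lifted one via finite differences:

```python
import numpy as np

def hessian_fd(f, x, eps=1e-4):
    """Central finite-difference Hessian of f at x (exact for quadratics)."""
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = eps
            ej = np.zeros(n); ej[j] = eps
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * eps ** 2)
    return H

N = 5
x0 = np.arange(1.0, N + 1.0)

# Original objective (sum_i x_i)^2: its Hessian is 2 * ones((N, N)) -- dense.
H_orig = hessian_fd(lambda x: x.sum() ** 2, x0)
assert np.allclose(H_orig, 2 * np.ones((N, N)), atol=1e-3)

# Lifted objective r^2, with the coupling r = sum_i x_i moved into a linear
# equality constraint: the objective Hessian is the 1x1 matrix [[2]].
H_lift = hessian_fd(lambda r: r[0] ** 2, np.array([x0.sum()]))
assert np.allclose(H_lift, [[2.0]], atol=1e-3)
```

The $N^2$ dense entries collapse to one nonzero; the structure that was in the Hessian now lives in the (sparse, linear) equality constraint, which interior-point and SQP solvers handle cheaply.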