How to prove an expression of log-sum-exp type is convex in DCP?


In Disciplined Convex Programming (DCP) the user must provide the program with a certificate that a (possibly multivariate) function is convex. The certificate is built from a predefined set of functions such as

  • $x^2/y$
  • $(x_1 \ldots x_k)^{1/k}$
  • $\log (e^{x_1} + \ldots + e^{x_k})$
  • $-\log \det X $
  • the list is a bit longer than that, but not exhaustive
  • and compositions (superpositions) of these.

However, despite the appealing tree-like structure, the DCP rules are designed not for the user's convenience, but because they are needed to construct the corresponding barriers for interior-point optimisation methods. There is no general-purpose barrier constructor for arbitrary convex functions, and one cannot expect every convex function to be expressible through the rules. DCP is a good alternative to composing barriers by hand and manually performing cumbersome reformulations of the initial optimisation program. A good reference on the topic is Michael Grant's PhD dissertation. He also mentioned a useful FAQ written for MATLAB users but applicable to Python as well.

Consider as an example the multivariate function $$ f(x_1, \ldots, x_d) = \dfrac{1}{1 - e^{x_1}} \cdot \dfrac{1}{1 - e^{x_2}} \cdots \dfrac{1}{1 - e^{x_d}} - 1 $$ with domain $ x_i < 0, i = \overline{1,d} $. I want to express the multivariate convex function $$ \varphi(x_1, \ldots, x_d) := \log f(x_1, \ldots, x_d) $$ using the DCP set of rules.
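It is easy to sanity-check numerically (not a proof) that $\varphi$ is midpoint-convex at random points of the domain; the helper `phi` below is ad hoc:

```python
import math
import random

def phi(x):
    """phi(x) = log( prod_i 1/(1 - e^{x_i}) - 1 ), defined for x_i < 0."""
    prod = 1.0
    for xi in x:
        prod *= 1.0 / (1.0 - math.exp(xi))
    return math.log(prod - 1.0)

random.seed(0)
d = 4
violations = 0
for _ in range(1000):
    a = [-random.uniform(0.01, 5.0) for _ in range(d)]
    b = [-random.uniform(0.01, 5.0) for _ in range(d)]
    m = [(ai + bi) / 2.0 for ai, bi in zip(a, b)]
    # Midpoint convexity: phi((a+b)/2) <= (phi(a) + phi(b))/2, up to roundoff.
    if phi(m) > (phi(a) + phi(b)) / 2.0 + 1e-9:
        violations += 1
print(violations)  # 0 expected for a convex function
```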

P.S. A more general question is how to prove convexity of expressions within DCP in more general cases. Perhaps there is a tutorial or a "search algorithm" for this. I am not aware of any such references, and neither is Michael Grant.

P.P.S. I added the tag python because my main concern is the set of cvxpy disciplined convex functions, which may not coincide exactly with the set above. MATLAB has a similar system called CVX; the rules may differ slightly.

UPD: I edited the question because it was initially formulated for the particular case $ x_1 = \ldots = x_d = x $. The current formulation is more general.

BEST ANSWER

Let me first show that $ \varphi = \log f(x_1, \ldots, x_d) $ is convex. Indeed, expanding each factor as a geometric series $ \frac{1}{1 - e^{x_i}} = \sum_{k \geq 0} e^{k x_i} $ (valid for $ x_i < 0 $) and multiplying out, we obtain something of the form $$ \log \sum_{(i_1, \ldots, i_d)} a_{i_1 \ldots i_d} e^{i_1 x_1 + \ldots + i_d x_d} $$ with nonnegative coefficients $ a_{i_1 \ldots i_d} $ (subtracting $1$ only removes the constant $(0,\ldots,0)$ term). This is a log-sum-exp, a standard "textbook" convex function — but with exponentially many terms, which we want to avoid.
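For $d = 1$ the expansion is just the geometric series $ \frac{1}{1-e^{x}} - 1 = \sum_{k \geq 1} e^{kx} $ for $x < 0$, which is easy to verify numerically:

```python
import math

x = -0.7  # any x < 0
exact = 1.0 / (1.0 - math.exp(x)) - 1.0
# Truncated geometric series: the coefficients a_k = 1 are nonnegative,
# so the log of this sum is a (long) log-sum-exp expression.
series = sum(math.exp(k * x) for k in range(1, 200))
print(exact, series)
```

The two values agree to machine precision; the point is only that the coefficients are nonnegative, which is what makes $\log f$ a log-sum-exp.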

I assume that convex functions appear only on the right-hand side of inequalities,

CONCAVE >= CONVEX;

this convention will be used when introducing new slack variables and inequalities.

After the change of variables $ \dfrac{1}{1 - e^{x_i}} = 1 + e^{s_i} $, which can be expressed via the slack inequalities $$ s_i \geq x_i + \log( 1 + e^{s_i}), \quad i = \overline{1, d}, $$ the problem reduces to giving DCP rules for $$ h(s_1, \ldots, s_d) = \log \Big( (1+e^{s_1})\ldots (1 + e^{s_d}) - 1 \Big) \enspace . $$ Note that each right-hand side $ x_i + \log(1 + e^{s_i}) $ is convex (affine plus a log-sum-exp of $0$ and $s_i$), so these inequalities fit the CONCAVE >= CONVEX pattern.
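The substitution can be sanity-checked numerically: for $x < 0$, setting $ s = \log\big(\tfrac{1}{1-e^{x}} - 1\big) $ makes the slack inequality tight:

```python
import math

for x in (-0.1, -1.0, -3.0):
    # s solving 1/(1 - e^x) = 1 + e^s
    s = math.log(1.0 / (1.0 - math.exp(x)) - 1.0)
    rhs = x + math.log(1.0 + math.exp(s))  # x + log(1 + e^s)
    print(x, s, rhs)  # s and rhs coincide: the inequality holds with equality
```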

The problem can be solved by introducing $ O(d^2) $ slack variables and inequalities. $$ \begin{cases} h &\geq \log(e^{p_{1,1}} + e^{p_{2,1}} + \ldots + e^{p_{d,1}} ),\\ p_{k,j} &\geq \log(e^{q_{k,j}} + e^{p_{k,j+1}}), \quad k=\overline{1,d},\ j \leq d-k, \\ p_{k,d-k+1} &= q_{k,d-k+1}, \quad k=\overline{1,d}, \\ q_{1,j} &= s_j, \quad j=\overline{1,d} ,\\ q_{k+1,j} &= s_j + p_{k,j+1} , \quad k = \overline{1,d-1},\ j \leq d-k \end{cases} $$

The expressions $p_{1,1}, p_{2,1}, \ldots, p_{d,1}$ play the role of (logarithms of) the elementary symmetric polynomials $$ \begin{cases} \sigma_1 := e^{s_1} + e^{s_2} + \ldots + e^{s_d},\\ \sigma_2 := e^{s_1+s_2} + e^{s_1+s_3} + \ldots + e^{s_{d-1} + s_d},\\ \ldots ,\\ \sigma_d := e^{s_1+s_2+\ldots +s_d} \end{cases} $$ These appear after expanding the brackets in $ h(s_1, \ldots, s_d) $: $$ h(s_1, \ldots, s_d) = \log(\sigma_1 + \sigma_2 + \ldots + \sigma_d). $$ The idea is to use the quadratic-time recursion for elementary symmetric polynomials instead of writing out the exponentially many summands explicitly. Writing $ T_k(j) $ for the sum of all products of $k$ distinct factors among $ e^{s_j}, \ldots, e^{s_d} $, we have $$ T_k(j) = e^{s_j}\, T_{k-1}(j+1) + T_k(j+1), \qquad \sigma_k = T_k(1). $$ The slack variable $ p_{k,j} $ represents $ \log T_k(j) $ and $ q_{k,j} $ represents $ \log\big(e^{s_j}\, T_{k-1}(j+1)\big) = s_j + p_{k-1,j+1} $: products are linear (equality constraints) in the log domain, while each addition becomes a log-sum-exp inequality. The upper-triangular arrays $(p_{k,j})_{kj}$ and $(q_{k,j})_{kj}$ are recursively computed through one another.
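The recursion can be checked in plain Python (a numerical sanity check with ad-hoc names, not cvxpy code): compute $ p_{k,j} = \log T_k(j) $ via $ T_k(j) = e^{s_j} T_{k-1}(j+1) + T_k(j+1) $ in the log domain and compare $ p_{k,1} $ against the brute-force $ \log \sigma_k $:

```python
import math
import random
from itertools import combinations

def lse2(a, b):
    """Numerically stable log(e^a + e^b)."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def sym_logs(s):
    """Return [log sigma_1, ..., log sigma_d] via the O(d^2) recursion."""
    d = len(s)
    p, q = {}, {}
    for j in range(1, d + 1):
        q[1, j] = s[j - 1]
    for k in range(1, d + 1):
        p[k, d - k + 1] = q[k, d - k + 1]        # single product s_j + ... + s_d
        for j in range(d - k, 0, -1):
            p[k, j] = lse2(q[k, j], p[k, j + 1]) # addition -> log-sum-exp
        if k < d:
            for j in range(1, d - k + 1):
                q[k + 1, j] = s[j - 1] + p[k, j + 1]  # product -> sum of logs
    return [p[k, 1] for k in range(1, d + 1)]

random.seed(1)
d = 5
s = [random.uniform(-2, 2) for _ in range(d)]
fast = sym_logs(s)
brute = [math.log(sum(math.exp(sum(c)) for c in combinations(s, k)))
         for k in range(1, d + 1)]
print(max(abs(a - b) for a, b in zip(fast, brute)))  # ~1e-15
```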