Adding new constraints while keeping the optimization convex and avoiding binary variables


I am working on a binary optimization problem, but I use soft labels to find an approximate solution. The optimization problem with soft labels is the following:

$$\begin{aligned}& \operatorname*{argmin}_{\mathbf{y}_u \in [0,1]^{U}} & \frac{1}{L} \left\| \mathbf{X} \left( \mathbf{X}_{e}^{T} \mathbf{X}_{e} \right)^{-1} \mathbf{X}_{e}^{T} \begin{bmatrix} \mathbf{y} \\ \mathbf{y}_u \end{bmatrix} - \mathbf{y} \right\|_2^2 \end{aligned}$$ where $\mathbf{X} \in \mathbb{R}^{L \times d}$, $\mathbf{X}_e \in \mathbb{R}^{(L+U) \times d}$, $\mathbf{y} \in \mathbb{R}^{L \times 1}$ and $\mathbf{y}_u \in \mathbb{R}^{U \times 1}$.

This is a convex optimization problem that can be written in standard quadratic programming form and solved with a gradient descent procedure.

Once I have the soft-label solution $\mathbf{y}_u = \left[ y_1, \dots, y_U \right]$, I obtain the binary solution by labeling as $0$ those $y_i < 0.5$ and as $1$ those $y_i > 0.5$, for each $i=1,\dots,U$.
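For concreteness, the soft-label problem can be solved by projected gradient descent (a gradient step followed by clipping to $[0,1]^U$). The following is a minimal numpy sketch on random toy data; the sizes `L`, `U`, `d`, the data matrices, and the helper `objective` are illustrative assumptions, not part of the original formulation:

```python
import numpy as np

rng = np.random.default_rng(0)
L, U, d = 20, 10, 5                            # toy sizes (assumed)
X = rng.normal(size=(L, d))                    # labeled design matrix
X_e = np.vstack([X, rng.normal(size=(U, d))])  # labeled + unlabeled rows
y = rng.integers(0, 2, size=L).astype(float)   # known binary labels

# A = X (X_e^T X_e)^{-1} X_e^T, so the objective is (1/L) ||A [y; y_u] - y||^2
A = X @ np.linalg.solve(X_e.T @ X_e, X_e.T)
A_l, A_u = A[:, :L], A[:, L:]                  # columns acting on y vs. y_u

def objective(y_u):
    return np.sum((A_l @ y + A_u @ y_u - y) ** 2) / L

# Projected gradient descent: step size 1/Lipschitz guarantees monotone descent.
lip = 2.0 / L * np.linalg.norm(A_u, 2) ** 2
y_u = np.full(U, 0.5)
for _ in range(2000):
    grad = 2.0 / L * A_u.T @ (A_l @ y + A_u @ y_u - y)
    y_u = np.clip(y_u - grad / lip, 0.0, 1.0)  # project back onto [0,1]^U

labels = (y_u > 0.5).astype(int)               # threshold soft labels at 0.5
```

The clipping step is exactly the Euclidean projection onto the box constraint $\mathbf{y}_u \in [0,1]^U$, which is what keeps each iterate feasible.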

Now I am working on two new variants of this problem that have additional constraints, and I do not know how to keep them convex. In particular, in both situations I have been forced to introduce binary variables, which I want to avoid if possible.

First case

In this case we have $b$ independent bags and we know the proportion of zeros and ones in each bag. For example, consider the simplest case where there is only one bag and the proportions of zeros and ones are $1/3$ and $2/3$ respectively. The only way I have found to add this restriction is to introduce binary variables $w_i$ and $z_i$ and add the following constraints:

$$\sum_{i=1}^{U}w_i=\frac{2}{3}U$$

$$\sum_{i=1}^{U}z_i=\frac{1}{3}U$$

$$w_i+z_i=1$$

$$w_i-2y_i \leq 0$$

$$z_i-2(1-y_i) \leq 0$$

$$w_i,z_i \in \lbrace0,1\rbrace$$

Since $U$ is assumed to be large, I can accept meeting the proportions only up to some tolerance, but I want the problem to remain a convex optimization problem.
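To sanity-check that this encoding behaves as intended, one can brute-force a tiny instance: with $U=3$ and proportions $1/3$ zeros and $2/3$ ones, the constraints should be satisfiable exactly when at least two soft labels lie at or above $0.5$ and at least one lies at or below $0.5$. The helper name `encoding_feasible` and the toy size are assumptions for illustration:

```python
from itertools import product

U = 3  # toy bag: 1/3 zeros, 2/3 ones => sum(w) = 2, sum(z) = 1

def encoding_feasible(y):
    """Check whether binary w, z exist satisfying all bag constraints for soft y."""
    for w in product([0, 1], repeat=U):
        z = tuple(1 - wi for wi in w)            # enforces w_i + z_i = 1
        if sum(w) != 2 or sum(z) != 1:           # proportion constraints
            continue
        # w_i <= 2 y_i and z_i <= 2 (1 - y_i)
        if all(wi <= 2 * yi for wi, yi in zip(w, y)) and \
           all(zi <= 2 * (1 - yi) for zi, yi in zip(z, y)):
            return True
    return False

encoding_feasible([0.9, 0.8, 0.1])  # True: two labels >= 0.5, one <= 0.5
encoding_feasible([0.1, 0.2, 0.1])  # False: no w with sum(w) = 2 fits
```

This confirms the encoding ties the proportions to the soft labels, but only at the price of the binary variables $w_i, z_i$.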

Second case

In this other case we again have $b$ independent bags, and we know that within each bag all the $y_i$ share the same value ($0$ or $1$), but we do not know which value. To illustrate, suppose again the simplest case of a single bag. I have tried a penalty method that leads to this optimization problem:

$$\begin{aligned}& \operatorname*{argmin}_{\mathbf{y}_u \in [0,1]^{U}} & \frac{1}{L} \left\| \mathbf{X} \left( \mathbf{X}_{e}^{T} \mathbf{X}_{e} \right)^{-1} \mathbf{X}_{e}^{T} \begin{bmatrix} \mathbf{y} \\ \mathbf{y}_u \end{bmatrix} - \mathbf{y} \right\|_2^2 + \epsilon \, \xi \end{aligned}$$

$$y_i \leq \left( 0.5 +\epsilon \right) \text{ for all } i=1,\dots,U$$

$$y_i \geq \left( 0.5 - \xi \right) \text{ for all } i=1,\dots,U$$

$$\epsilon + \xi =0.5$$

$$\epsilon, \xi \geq 0$$

In this case I think the problem is handled because minimizing drives the penalty term $\epsilon \, \xi \rightarrow 0$, which forces either $\epsilon \rightarrow 0$ with $\xi \rightarrow 0.5$ (pushing every $y_i$ below $0.5$) or $\xi \rightarrow 0$ with $\epsilon \rightarrow 0.5$ (pushing every $y_i$ above $0.5$).
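The intended mechanism can be made explicit: under $\epsilon + \xi = 0.5$, the box constraints reduce each $y_i$ to the interval $[0.5 - \xi,\; 0.5 + \epsilon]$, and the two extremes of $(\epsilon, \xi)$ pin the whole bag to one side of the threshold. A tiny sketch (the helper name `feasible_box` is an illustrative assumption):

```python
def feasible_box(eps, xi):
    """Feasible interval for each y_i under y_i <= 0.5 + eps, y_i >= 0.5 - xi."""
    assert eps >= 0 and xi >= 0 and abs(eps + xi - 0.5) < 1e-12
    return (0.5 - xi, 0.5 + eps)

feasible_box(0.0, 0.5)  # (0.0, 0.5): every y_i forced to the "0" side
feasible_box(0.5, 0.0)  # (0.5, 1.0): every y_i forced to the "1" side
```

So the constraints do express "all labels in the bag agree", but the product term $\epsilon \, \xi$ in the objective is what causes trouble, as the answer below explains.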

Best Answer

The constraints in both cases are inherently non-convex, so it's unlikely that there's a convex formulation. In fact, though I believe your formulation in the second case to be conceptually correct, it is no longer convex. For an indication of why, you can investigate the Hessian of $f(x,\epsilon,\xi) = x^2 +\epsilon\xi$, which is a simpler objective in the same form as yours. The Hessian is not positive semi-definite.
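The Hessian argument is easy to verify numerically. With variable order $(x, \epsilon, \xi)$, the Hessian of $f(x,\epsilon,\xi) = x^2 + \epsilon\xi$ is constant, and one of its eigenvalues is negative:

```python
import numpy as np

# Hessian of f(x, eps, xi) = x**2 + eps*xi, variable order (x, eps, xi)
H = np.array([[2.0, 0.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])

eigvals = np.linalg.eigvalsh(H)  # eigenvalues in ascending order: -1, 1, 2
print(eigvals)                   # a negative eigenvalue => H not PSD => f non-convex
```

The $-1$ eigenvalue comes from the indefinite bilinear block $\epsilon\xi$, which is exactly the penalty term added in the second formulation.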

To see the non-convexity of the constraints, take the simple case of $U = 2$ and, for the first case, a proportion of $1/2$. The region of the plane satisfying the proportion constraint is the union of the sets $[0,1/2]\times [1/2,1]$ and $[1/2,1]\times [0,1/2]$, which is clearly non-convex. For the second type of constraints and $U=2$, the feasible region is a similar non-convex union.

If you have to resort to binary variables, the cleanest way may be to make the $y_i$ themselves binary. Then the first constraint can be written as

$$ \sum_i y_i \leq \frac{2}{3}U +\epsilon $$

$$ \sum_i y_i \geq \frac{2}{3}U -\epsilon $$

for some small $\epsilon$. The second case can be modeled with

$$ y_1 = y_i $$ for all $i$.

CPLEX, Gurobi, and others are getting quite good at convex quadratic mixed integer optimization, though if you're in the big data regime the size of your problem may still be prohibitive.
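For very small $U$, the binary formulation suggested above can even be checked by plain enumeration instead of a MIP solver; the sketch below brute-forces binary $\mathbf{y}_u$ under the proportion constraint with tolerance. The toy sizes and random data are illustrative assumptions, and a solver such as CPLEX or Gurobi would replace the `for` loop in practice:

```python
from itertools import product
import numpy as np

rng = np.random.default_rng(1)
L, U, d = 12, 6, 3                         # toy sizes (assumed)
X = rng.normal(size=(L, d))
X_e = np.vstack([X, rng.normal(size=(U, d))])
y = rng.integers(0, 2, size=L).astype(float)
A = X @ np.linalg.solve(X_e.T @ X_e, X_e.T)  # A = X (X_e^T X_e)^{-1} X_e^T

target, eps = 2.0 / 3.0 * U, 0.5           # proportion of ones, tolerance

best, best_val = None, np.inf
for y_u in product([0.0, 1.0], repeat=U):  # enumerate binary assignments
    if abs(sum(y_u) - target) > eps:       # |sum y_i - (2/3)U| <= eps
        continue
    val = np.sum((A @ np.concatenate([y, y_u]) - y) ** 2) / L
    if val < best_val:
        best, best_val = np.array(y_u), val
```

The enumeration is exponential in $U$, which is precisely why the large-data regime mentioned above becomes prohibitive and a dedicated MIQP solver (or the convex relaxation) is needed.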