Maximum subset as a integer linear program

710 Views Asked by At

Ok so we have set of $n$ random numbers $a_1,a_2,...,a_n$ where $a_i \in \Re$ and and we want to find the subset with the maximum sum (example 3,-2.5, 2, 6, 1,-10 max subset is: $a_3,a_4,a_5$ etc).I want to create an integer linear program that describes this problem. My idea so far is this:

Objective function : $max\sum_1^nx_ia_i$

subject to: $x_i\in(0,1)$ ($x_i$ represents if $a_i$ will be part of the maximum subset)

but i simply can't find a way to create a constrain about numbers being consecutive ..

2

There are 2 best solutions below

0
On BEST ANSWER

I'm going to assume that you are looking for a selection of consecutive values. There are multiple ways to model this, including the following. Start with binary variables $s_i\in\{0,1\}$ and $e_i\in\{0,1\}$ for $i=1,\dots,n$. $s_i$ signals the start of the selection and $e_i$ signals the end. Add the constraints$$\sum_{i=1}^n s_i=1$$and$$\sum_{i=1}^n e_i=1$$to ensure a single start and end to the sequence. Next, mainly for convenience, we'll introduce some (continuous) auxiliary variables:$$y_i=\sum_{j=1}^i s_j$$and$$z_i=\sum_{j=i}^n e_j.$$ They also help with sparsity of the constraint matrix. Finally, we introduce continuous variables $x_i\in [0,1],\, i=1,\dots,n$, signaling whether a term is in the selection. The objective is, as you wrote, $\sum_{i=1}^n a_i x_i$.

Now we need to add constraints to make sure the $x$ variables contain a single contiguous block of ones. Those constraints are:$$x_i \le y_i\, \forall i,$$$$x_i \le z_i\, \forall i,$$and$$x_i \ge y_i + z_i - 1\, \forall i.$$The first two constraints say that, in order for $x_i$ to be positive, the block must start at or before position $i$ and end at or after position $i$. The final constraint says that if both those things happen, $x_i$ must be 1 (cannot be 0).

0
On

This can be easily done with an efficient implementation of a complete enumeration.

First compute $A_n = \sum_{i \le n} a_i$ for all $n$. Also set $A_0 = 0$. To find the subset of $m$ consecutive numbers that have the largest sum, determine the maximum of the quantities $C_k = A_{k+m} - A_k$ for $k = 0, \dots, n-m$.

The overall computational effort is $O(n)$.

It's easy to modify this so that it also works in an online mode (if the $a_i$ arrived one at a time).