How to solve this combinatorial problem with guaranteed bound

110 Views Asked by At

Let $a_i \in R^n$ and $c_i \in R$. Solve

$$ \max_{x \in \{0, 1\}^n} \sum_{i=1}^m c_i \max \{0, a_i^\top x\}, \quad \text{subject to } \sum_{j=1}^n x_j = k $$

for some $k \in N$. If the exact solution is NP-hard, is it possible to have an approximate solution with guaranteed bound?

I can think of a few approximate solutions such as greedy and Lagrange dual. Or relax the $0/1$ constraint of $x_i$ into $[0, 1]$, followed by setting the $k$ greatest $x_i$ into $1$ (and the rest to $0$). Since this is a discrete problem and a solution with provable bound is sought, I have little clue how to even start. Submodularity doesn't hold here either.

The question arises from a variable selection problem. $x$ corresponds to the selector with a budget $k$, and the entries in each $a_i$ are the covariates of example $i$. Max with $0$ is introduced to get some nonlinearity (like the rectified linear unit). $c_i$ corresponds to the target variable. Some simplifications from the original problem are applied, allowing us to focus on the key underlying challenge.

1

There are 1 best solutions below

3
On

If the scale of this problem is not large enough (i.e., $n$ is not too large), you can solve it by MIP, and software package gurobi or cplex can help you.

If the scale is very large, maybe you can reformulate this problem by \begin{equation} \begin{array} \ {\max_{t, x}} & {\sum_{i=1}^m c_i t_i} \\ & {t_i \ge a_i^{\mathsf{T}}x, \quad t_i \ge 0}\\ & {t_i(t_i - a_i^{\mathsf{T}}x) \leq 0} \\ & {x_j(x_j-1) \leq 0, \quad j=1,\dots,n} \\ & {x_j(x_j -1) \ge 0, \quad j=1,\dots,n} \end{array} \end{equation} And then this problem can be solved by DC programming. you can refer to CCP.