Algorithm for allocating numbers into groups such that the maximum number of elements in the groups is minimized

38 Views Asked by At

Suppose there is a set of arrays $S= [A_1, A_2, ..., A_k] $, where $A_i$ is a finite subset of $\mathbb{Z}$. Given a positive integer $g$, I want to build $g$ sets $G_1,G_2,\dots,G_g$ such that $\forall i\in[k], \exists j\in[g], A_i \subseteq G_j$ and I want the maximum size of the groups minimized, i.e., $$\min \max_j |G_j|.$$

For example, $S=[[1,2,3],[2,5],[4,6,7]]$, $g=2$, then the optimal groups are $G_1=[1,2,3,5], G_2=[4,6,7]$, and the objective value is 4.

I wonder what's the optimal algorithm (or any algorithm better than enumerating all possible groups) for solving this problem?

2

There are 2 best solutions below

0
On

You can show that your problem is $\mathcal{NP}$-hard by reduction from multi-processor scheduling (take all the $A_i$ disjoint, and the size of the $A_i$ as the length of the tasks), so there is probably no polynomial time algorithm for this problem. Your problem (with $A_i$ not necessarily disjoint) can be seen as a subproblem of submodular scheduling (like for submodular bin-packing).

If $g$ is fixed and that all the $A_i$ are disjoint, then the problem is polynomial by using a dynamic programming algorithm (weakly polynomial for scheduling, but strongly polynomial in your case). I don't see how the dynamic programming algorithm can be adapted in the general case, but maybe you can do something about it.

Note that there always are better algorithms than enumerating all possible groups, even when the problem is $\mathcal{NP}$-hard. Depending on the size of the instances at hand, full generation can be good enough for small instances, otherwise, you can do a branch and bound with some clever bound on your cost function, or model the problem with mathematical programming if your instances are very large.

0
On

You can solve the problem via integer linear programming as follows. Let binary decision variable $x_{ij}$ indicate whether set $A_i$ is assigned to group $G_j$. Let binary decision variable $y_{kj}$ indicate whether element $k$ appears in group $j$. Let nonnegative integer decision variable $z$ represent $\max_j |G_j|$. The problem is to minimize $z$ subject to linear constraints \begin{align} \sum_j x_{ij} &= 1 &&\text{for all $i$} \tag1\label1 \\ x_{ij} &\le y_{kj} &&\text{for all $i$, $k\in A_i$, and $j$} \tag2\label2 \\ \sum_k y_{kj} &\le z &&\text{for all $j$} \tag3\label3 \end{align} Constraint \eqref{1} assigns each set to exactly one group. Constraint \eqref{2} enforces the implication: if $A_i$ is assigned to $G_j$, then all elements of $A_i$ appear in $G_j$. Constraint \eqref{3} enforces the min-max objective.

A weaker but smaller formulation replaces \eqref{2} with the aggregation $|A_i|x_{ij} \le \sum_{k\in A_i} y_{kj}$ for all $i$ and $j$.