Given a set S and n subsets of S: Find the subset of each of the n subsets, up to k elements, such that the union is maximized

105 Views Asked by At

Given a set S, and n subsets of S, which may be overlapping or disjoint: Find up to k elements from each of the n given subsets, such that the union of the elements selected from each subset is maximized.

In other words, if S = {1,2,3,4,5,6,7,8,9,10}, for n=3 the subsets given may look like {1,3,6,7,9,10}, {1,2,3}, {2,3,6,7,8}. Find a subset of each of these subsets such that the union of them is maximized. So the solution for k=3 may look like {7,9,10} {1,2,3} {6,8}. I am struggling to understand the best algorithm for this.

Another way to think about it is there are N provider points and C consumer points and each provider point can form an edge with up to k consumer points. Given the sets of consumer points that are reachable for each N, find the subset of consumer points with up to k elements, to connect to for each provider in order to maximize the number of consumer points connected to a provider.

2

There are 2 best solutions below

0
On

You can solve the problem via integer linear programming as follows. For $i\in [n]$, let $S_i$ be the given subset. For $i\in [n]$ and $j\in S_i$, let binary decision variable $x_{i,j}$ indicate whether element $j$ is selected from $S_i$. For $j\in S$, let binary decision variable $y_j$ indicate whether $j$ appears in the union. The problem is to maximize $\sum_{j\in S} y_j$ subject to linear constraints \begin{align} \sum_{j\in S_i} x_{i,j} &\le k &&\text{for $i\in [n]$} \tag1\\ y_j &\le \sum_{i\in [n]} x_{i,j} &&\text{for $j\in S$} \tag2 \end{align} Constraint $(1)$ enforces selecting at most $k$ elements from $S_i$. Constraint $(2)$ enforces the logical implication $y_j \implies \vee_{i\in [n]} x_{i,j}$.

0
On

You can solve this question by reformulating it using graph theory in a few ways.

Solution 1: Network flow

Let us create a bipartite graph with edge capacities with $|S|$ elements of set $S$ on partition $A$ and $n$ sets on partition $B$. Let there be an $1$ capacity edge from an element in $A$ to a set in $B$ if the set contains the element. Now to this bipartite graph, we add two vertices, source vertex and sink vertex. The source vertex is connected to all vertices in $B$ with $k$ capacity edges and the sink vertex is connected to all vertices in $A$ with $1$ capacity edges. Now this is a flow network and the maximum integer flow to this network will show which elements to choose in each set. This can be solved using the Ford–Fulkerson algorithm in $O(Ef)$ time. Since the maximum flow $f$ cannot be greater than $|S|$, the algorithm will run in $O(E |S|) = O(n|S|^2)$.

Solution 2: Bipartite matching

Let us create a bipartite graph with $|S|$ elements of set $S$ on partition $A$ and $n$ sets repeated $k$ times on partition $B$. Let there be an edge from an element in $A$ to a set in $B$ if the set contains the element. Now a maximum matching in this bipartite graph will give us the set of elements to choose in each set. The Bipartite matching problem can be solved by Hopcroft–Karp algorithm in $O(E \sqrt{V}) = O(n|k||S|\sqrt{n|k|+|S|})$ time.