Collection of sets that are supersets of a collection of required sets, that is optimal under a cost function

30 Views Asked by At

I have a set of elements, $U$.

I have a collection of subsets of $U$, which I call my requirements, $R$.

The problem: Find a $T$, a collection of subsets of $U$, such that $\forall r\in R, \exists t\in T \textrm{ s.t. } r\subseteq t$ that (locally) minimises $\sum_i |t_i|^2$.

I wondered if anyone knew if this falls within a formal class of problems for which I could use an off the shelf algorithm to solve. I don't know a lot about the area I'm afraid, so asking for help here. I checked out Integer linear programming and the set cover problem but I don't think either of these are appropriate. Equality constrained least squares was closer, but I couldn't formulate my problem in that way. It can be seen as combinatorial optimisation problem, but is there a good algorithm I could use?

Additional practical consideration:

The optimisation doesn't need to be that good for it to be useful to me.

If $N=|U|$ then I have N ~ 10000m where m might be between 2 and 200. $|R|=N$, and for each requirement, $r\in R$ there will be approximately of order $m$ other $r'$ which are similar to $r$.

1

There are 1 best solutions below

2
On BEST ANSWER

You can formulate the problem as weighted set covering. For each $t\subseteq U$, let binary decision variable $x_t$ indicate whether $t\in T$. The objective is to minimize $\sum_t |t|^2 x_t$, and the constraints are: $$\sum_{t:r\subseteq t} x_t \ge 1$$ for all $r\in R$.

If the problem is too large to solve directly, you can use dynamic column generation to introduce promising $t$ sets on the fly as needed.

An alternative heuristic approach is to cluster the $r$ sets and assign one $t$ for each cluster. If the resulting solution is good enough, you can stop. Otherwise, you can use these $t$ sets to initialize the column generation.