Choosing a random subset (algorithmically) that has specified intersections with other given subsets

77 Views Asked by At

I am given

  • a finite set $X$,
  • $n$ subsets $X_1,...,X_n\subseteq X$,
  • $n$ integers $k_1,...,k_n\ge 0$, and
  • a number $K\ge 0$.

My goal is to choose a subset $Y\subseteq X$ with $|Y|=K$ and $|Y\cap X_i|\ge k_i$ for all $i\in\{1,...,n\}$. My eventual goal is to write a program that chooses $Y$ at random (I am not so much conerned with the distrubution right now, as long as it is not too pathological).

Question: How can I algorithmically make such a choice, or decide that it is not possible? Can this be done "efficiently"?

Usually, $K\approx 10$ and I expect $n\le 20$. Still, the attempts that I made required to compute all possible intersections of the $X_i$ and their complements, which is exponential in $n$.

1

There are 1 best solutions below

2
On

Thoughts too long for a comment.

I think that an eventual solution to your problem may not need an answer to the particular question you asked(*), Since your goal is a kind of random choice you may be able to use randomness from the start.

Clearly with $n \approx 20$ and $K \approx 10$ on average elements of $Y$ will have to belong to two of the $X_i$.

Consider choosing one element from each $X_i$ at random. Then check the resulting set to see if you can throw out all but $K$ of them and still have at least one from each $X_i$. If not, try choosing randomly again or patching the list by randomly replacing choices that live in just one (or two) $X_i$.

Since $n$ is smaller than $2k$ perhaps a similar random strategy will work if you start by precomputing the $k(k-1)/2$ pairwise intersections and use those sets somehow to make initial random choices.

In any case efficiency may depend as much on the data structures you use to represent your sets as on the algorithm in your program.

(*) https://en.wikipedia.org/wiki/XY_problem