Probability that m samples without replacement cover the entire set

111 Views Asked by At

Let $n$ be the size of the set $S = \{s_1, s_2, ..., s_n\}$ of unique items. Let $m$ be the number of players $p_1, p_2, ..., p_m$. Each player draws $k < n$ items from the set $S$ without replacement and with uniform probability, forming their own subset $p_i = \{p_{i, 1}, p_{i, 2}, ..., p_{i, k} \}$. Each player draws their item iid, so different players can hold the same item.

Given the size of the set $n$, the number of players $m$, and the sample size of each player $k$, what is the probability that the union of the sets of all the $m$ players cover the entire set $S$? i.e. that $p_1 \cup p_2 \cup ... \cup p_m = S$?

I'm not sure if this is a solved problem or not, so any analysis of the probability would also be appreciated (e.g. approximations, lower/upper bound, asymptotic behaviors)! Solutions other than closed form are also appreciated (e.g. recursive, polynomial-time algorithm).

1

There are 1 best solutions below

0
On BEST ANSWER

With the help of @lulu I found this closed form solution:

Fix $r$ elements in the set $S$. The probability that one player does not sample any of those $r$ elements is $\binom{n-r}{k}/\binom{n}{k}$. The probability that all $m$ players do not sample those $r$ elements is $(\binom{n-r}{k}/\binom{n}{k})^m$.

There are $\binom{n}{r}$ such ways to choose those $r$ elements. Therefore the probability that all the players do not sample any arbitrary set of $r$ elements is $(\binom{n-r}{k}/\binom{n}{k})^m \binom{n}{r}$. Let $P(r) = (\binom{n-r}{k}/\binom{n}{k})^m \binom{n}{r}$

Due to the principle of inclusion/excluson, the probability that all the players can cover the entire set $S$ is: $P^* = 1 - P(1) + P(2) - ... + (-1)^{n-k}P(n-k)$ $ = 1 + \sum_{i=0}^{n-k} (-1)^i (\binom{n-i}{k}/\binom{n}{k})^m \binom{n}{i}$

Note that we don't consider the probability of missing more than $n-k$ items, since those probabilities will be $0$.