Simple Urn problem doesn't fit with experiment

66 Views Asked by At

Given an urn with $n$ numbered balls $[n] = \{1,...,n \}$, one chooses with repetitions $m$ balls. What is the probability that the set of chosen balls $M$ contains a fixed set $S \subset [n]$ of size $k$ (e.g. $S = [k]$, $S$ is a set, not a multiset)?
My answer is the following:
First, we need to choose the "places" for the $k$ balls in $S$, and we have ${m \choose k}$ ways to do so. Then, we need to multiply by $k!$ since we don't care about the order of which the elements were chosen. Then, the remaining $m-k$ places can be anything. There are overall $n^m$ possible choices, thus, $$P = \frac{{m \choose k}k! n^{m-k}}{n^m} = \frac{{m \choose k} k!}{n^k}.$$

I wasn't so sure of my math, so I wrote a simple python script which does the following: it samples 10 samples (with replacements) from 0-9, then it tests if we got all the numbers between 0-4. In the above notation, $k=5, m=n=10$. I run the code 100000 times, then divided the number of "true" runs in this number, i.e. $\hat{p} = \frac{\#\text{true runs}}{100000}$, which I believed give an estimator of my real quantity. However, the result I got was .07755, and the result from the formula is .3024.
Is the bug in my math or in the code? (I don't care about the code, so if the math is correct I'll be satisfied).

2

There are 2 best solutions below

0
On BEST ANSWER

To count the number of ways where every ball in $S$ is chosen...

  • Take all possible ways to choose the balls, $n^m$.

  • For each element of $S$, subtract the ways where that element is not chosen. There are $k$ elements of $S$, and $(n-1)^m$ ways for each to not be chosen, so subtract $k(n-1)^m$.

  • For each pair of elements of $S$, add back in the ways where both elements in the pair were not chosen (as these were double subtracted before). We add in $\binom{k}2(n-2)^m$.

  • For each triple, subtract the ways where every object in the triple was not chosen, $\binom{k}3(n-3)^m$.

  • $\vdots$

The result is $$ P(\text{every object in $S$ is chosen})=n^{-m}\sum_{i=0}^k(-1)^i\binom{k}i(n-i)^m. $$

0
On

I think you may be overcounting. To formalise this, you want to know the probability that a uniformly chosen ordered $m$-tuple from $\{1,...,n\}$ contains a fixed set of size $k$. Suppose $k =1$, $m=3$. Then the set $\{1\}$ is in the $3$-tuple $(1,1,1)$ but you count it thrice (one time for each of the coordinates). I suspect you will have to use inclusion-exclusion to count these more accurately.