Find overlapping outcomes in independent draws


I have an urn containing 100 numbered balls. I randomly draw 5 balls (5 %) in 5 individual draws with replacement.

How many unique balls do I get? I.e., what is the number of unique values?

I've written some R code to approximate it, but I have no idea how to do this in a more 'mathematical' way, e.g., with a formula of some sort.

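draw <- function(size){
   n <- 1:size
   # 5 rounds; each round samples 5% of the balls without replacement
   sa <- lapply(1:5,function(x) sample(n,size*0.05))
   # fraction of distinct balls seen across all rounds
   length(unique(unlist(sa)))/size
}
# average over 1000 simulated runs
mean(unlist(lapply(1:1000,function(x) draw(100)))) # ~0.226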

There is 1 best solution below


We will use Linearity of Expectation.

To that end, let $X_i$ denote the indicator variable for the $i^{th}$ ball. That is, $X_i=1$ if the $i^{th}$ ball is in your selection and $X_i=0$ otherwise. Linearity tells us that the answer we want, $E$, is $$E=E\left[ \sum_{i=1}^{100} X_i\right]=\sum_{i=1}^{100} E[X_i]=100\times E[X_1]$$

where, in the last equality, we used symmetry to conclude that all the $X_i$ have the same expected value.

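To compute $E[X_1]$: note that this is simply $p$, the probability that we select ball $1$ at least once. Treating the selection as five single-ball draws with replacement, each draw independently misses ball $1$ with probability $\frac {99}{100}$, so the probability that we never select it is $\left(\frac {99}{100}\right)^5$. Hence $E[X_1]=p=1-\left(\frac {99}{100}\right)^5\approx 0.049$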
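
Spelling out the arithmetic, as a quick check on the rounding: $$\left(\frac{99}{100}\right)^5 = 0.9509900499, \qquad p = 1 - 0.9509900499 = 0.0490099501$$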

It follows that our answer is $$E\approx 4.9$$
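
As a rough sanity check, the same indicator argument can be wrapped in a small R function and compared with a simulation. This is only a sketch: the function names `expected_unique` and `expected_unique_rounds` are made up for illustration. The second function assumes the setup in the question's R code (5 rounds, each taking 5 distinct balls, with the balls returned between rounds), where a given ball is missed in one round with probability $\frac{95}{100}$ instead of $\frac{99}{100}$.

```r
# Closed form for k single draws with replacement from n balls
# (the setup the derivation above uses).
expected_unique <- function(n, k) n * (1 - (1 - 1/n)^k)

# Assumed variant matching the question's R code: `rounds` rounds, each
# taking m distinct balls, with all balls returned between rounds.
expected_unique_rounds <- function(n, m, rounds) n * (1 - (1 - m/n)^rounds)

# Monte Carlo check of the first formula
set.seed(1)  # arbitrary seed, for reproducibility only
sim <- mean(replicate(10000, length(unique(sample(100, 5, replace = TRUE)))))

expected_unique(100, 5)            # about 4.90
expected_unique_rounds(100, 5, 5)  # about 22.6, i.e. 0.226 of the urn
sim                                # should be close to 4.90
```

Under that second reading the closed form gives a fraction of about 0.226, which agrees with the simulated value reported in the question.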