Probability of extracting k different coloured balls in a sample of m>k balls.


I am having some problems with the following exercise:

Suppose an urn contains balls of only six different colours, where each colour is equally likely to be obtained in each extraction; i.e., the probability doesn't change from one extraction to the next.

Now, let's extract a sample of 10 balls and compute the probability of obtaining the six different colours in it.


This might be a dumb question but I have no idea about how to proceed. Please, help me.


There are 2 best solutions below


This is an example of the famous coupon collector's problem, which is part of a more general probability problem called the classical occupancy problem. It can be reframed as follows:

Suppose we have $n=10$ balls and $m=6$ (coloured) bins. We allocate the balls to the bins at random. Find the distribution of the number of non-empty bins (classical occupancy problem), and the probability that all bins are non-empty (coupon collector's problem).

Let $K$ be the number of colours obtained from this allocation (i.e., the number of bins that are non-empty). The distribution of this random variable is the classical occupancy distribution which has mass function (see related question):

$$\text{Occ}(k|n,m) = \frac{(m)_k \cdot S(n,k)}{m^n} \quad \quad \text{for all } k = 1, 2, \ldots, \min(n,m).$$

(In this expression the values $(m)_k = m (m-1) \cdots (m-k+1)$ are the falling factorials and the values $S(n,k)$ are the Stirling numbers of the second kind.) For the coupon collector's problem we are interested in the probability that $K=m$, which is:

$$\text{Occ}(m|n,m) = \frac{m! \cdot S(n,m)}{m^n} \cdot \mathbb{I}(m \leqslant n).$$
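
As a rough illustration of the mass function above, here is a minimal R sketch that evaluates it for every $k$. The helper names `stirling2` and `occupancy_pmf` are hypothetical (they are not from the original answer or from any package), and the Stirling numbers are computed with the standard inclusion-exclusion sum rather than looked up.

```r
# Hypothetical helper (not from the original answer):
# Stirling number of the second kind via inclusion-exclusion,
# S(n, k) = (1/k!) * sum_{j=0}^{k} (-1)^j * choose(k, j) * (k - j)^n
stirling2 <- function(n, k) {
  j <- 0:k
  sum((-1)^j * choose(k, j) * (k - j)^n) / factorial(k)
}

# Hypothetical helper: classical occupancy mass function
# Occ(k | n, m) = (m)_k * S(n, k) / m^n  for k = 1, ..., min(n, m)
occupancy_pmf <- function(n, m) {
  k <- 1:min(n, m)
  falling <- factorial(m) / factorial(m - k)      # falling factorial (m)_k
  s2 <- sapply(k, function(kk) stirling2(n, kk))  # S(n, k) for each k
  setNames(falling * s2 / m^n, k)
}

# n = 10 balls, m = 6 colours, as in the question
pmf <- occupancy_pmf(n = 10, m = 6)
print(round(pmf, 7))
```

The entry for $k = 6$ is the coupon collector's probability, and the entries should sum to one. Double-precision arithmetic is exact here because every intermediate integer is far below $2^{53}$; for much larger $n$ a recurrence or big-integer arithmetic would be safer.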

For your particular problem the desired probability for the coupon collector's problem is:

$$\text{Occ}(6|10,6) = \frac{6! \cdot S(10,6)}{6^{10}} = \frac{720 \cdot 22827}{60466176} = \frac{16435440}{60466176} \approx 0.2718121.$$
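
The numerator can also be checked by hand. The expansion below is not part of the original answer; it uses the standard inclusion-exclusion identity for counting surjections, $m! \cdot S(n,m) = \sum_{j=0}^{m} (-1)^j \binom{m}{j} (m-j)^n$, with $n = 10$ and $m = 6$:

$$\begin{aligned}
6! \cdot S(10,6) &= 6^{10} - 6 \cdot 5^{10} + 15 \cdot 4^{10} - 20 \cdot 3^{10} + 15 \cdot 2^{10} - 6 \cdot 1^{10} \\
&= 60466176 - 58593750 + 15728640 - 1180980 + 15360 - 6 \\
&= 16435440.
\end{aligned}$$

Dividing by $6! = 720$ gives $S(10,6) = 22827$, which agrees with the value used above.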


Confirming this by simulation: We can easily simulate the occupancy number in R by simulating the underlying allocation problem. Taking $S = 10^6$ simulations gives the following sample proportions for the occupancy number.

#Simulate this problem S times
set.seed(1);
S     <- 10^6;
BALLS <- array(ceiling(6*runif(10*S)), dim = c(S, 10));

#Count number of occupied bins
OCC   <- rep(0, S);
for (s in 1:S) { OCC[s] <- dim(table(BALLS[s,])) }

#Show sample proportions of outcomes
table(OCC)/S;

OCC
       2        3        4        5        6 
0.000279 0.018522 0.203018 0.506310 0.271871 

We can see that the proportion of cases where $K=6$ closely matches our probability calculation using the classical occupancy distribution.
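
To put a rough number on "closely matches", the sketch below compares the sample proportions reported above with the exact occupancy probabilities, measured in binomial standard errors. It is an illustrative check, not part of the original answer: the `simulated` vector is copied from the output above, and the variable names are my own.

```r
# Sample proportions reported above for K = 2, ..., 6 (S = 10^6 simulations)
S <- 10^6
k <- 2:6
simulated <- c(0.000279, 0.018522, 0.203018, 0.506310, 0.271871)

# Exact probabilities Occ(k | n, m) = (m)_k * S(n, k) / m^n,
# written as choose(m, k) * [k! * S(n, k)] / m^n
n <- 10
m <- 6
exact <- sapply(k, function(kk) {
  j <- 0:kk
  surjections <- sum((-1)^j * choose(kk, j) * (kk - j)^n)  # k! * S(n, k)
  choose(m, kk) * surjections / m^n
})

# Binomial standard error of each sample proportion, and the z-score
se <- sqrt(exact * (1 - exact) / S)
z  <- (simulated - exact) / se

print(data.frame(k, simulated, exact = round(exact, 6), z = round(z, 2)))
```

Each simulated proportion should land within a couple of standard errors of its exact value. The case $K = 1$ has probability $6/6^{10} \approx 10^{-7}$, so it is no surprise that it never appeared in $10^6$ simulations.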


Hint:

There are two approaches to counting the number of favourable outcomes before dividing by $6^{10}$.