How likely is a specific number X of empty buckets K given N balls?

97 Views Asked by At

Is there a simple formula or distribution curve to answer a question like this?

Assume there are K buckets and we want to randomly assign N balls to them. Each ball has an equal chance of being assigned to any of the buckets. There are more balls than buckets. When all balls have been assigned, what is the likelihood that X (or more) buckets remain empty?

To make it more concrete. Given 146 balls, and 77 buckets, with each ball being randomly assigned to one of those buckets, how likely/unlikely are there to be 35 empty buckets when all balls have been assigned?

If this simplifies the problem, the question could be asked, how likely/unlikely are there to be at least 35 empty buckets when all balls have been randomly distributed?

Would like to know if there is a way to characterize this as: that is a 1 in 1000 chance, or a 1 in a million chance, for example.

2

There are 2 best solutions below

0
On

If the number of balls and buckets is reasonably large, the number of balls in a specific bucket will approximate a Poisson distribution. The linearity of expectation then gives the expected number of empty buckets.

For low chances you can calculate the chance that a given set of buckets is empty and multiply by the number of ways to select a set of that many buckets. This takes advantage of the fact that if getting $35$ empty buckets is small the chance of getting $36$ is much smaller. When you multiply by the number of ways of getting $35$ empty you will count cases of $36\ 36$ times each, cases of $37\ {37 \choose 2}$ times each and so on.

Using your numbers as an example, the average number of balls per bucket is $\frac {146}{77} \approx 1.896$. The chance a given bucket is empty is about $0.012987$. The chance to have a given set of $35$ buckets empty is $\left(\frac {42}{77}\right)^{146}\approx 3.688 \cdot 10^{-39}$. The chance of at least $35$ empty is then less than ${77 \choose 35}$ times this, which gives $3.688 \cdot 10^{-17}$ We can probably ignore the overcounting in this case.

0
On

Let's start by looking at all the number of possible end states. I'm going to assume from the statement of your problem that the buckets themselves can be distinguished from one another, but that the balls cannot. So our final state can be represented simply by writing, in order, how many balls end up in each of the $k$ buckets. This is equivalent to finding a list of $k$ non-negative integers, in order, that sum to $n$, the total number of balls. The number of possible ways you can do this has a really nice explanation in this other question. I recommend you check it out. At any rate, that gives us that the total number of possible end states for our experiment is given by

\begin{equation} \binom{n+k-1}{k-1} \end{equation}

Then, the question becomes how many of those possible end states have exactly $x$ empty buckets. We can think of this in two parts. First, we have the number of ways we can choose which buckets should be empty, which is simply $\binom{k}{x}$. Then, we must consider how many possible ways the remaining $k-x$ buckets can be filled with the $n$ balls, keeping in mind that none of these buckets can be empty. The formula for the number of possible ways to put $n$ balls into $k$ buckets without leaving any buckets empty is also given in the question I liked above, and it turns out to be $\binom{n-1}{k-1}$. We can substitute in our problem's information to get $\binom{n-1}{k-x-1}$ in our case. Then the product of these two expressions is the total number of end states which have exactly $x$ empty buckets:

\begin{equation} \binom kx \binom{n-1}{k-x-1} \end{equation}

Finally, the probability that exactly $x$ buckets are empty is simply the number of end states in which $x$ buckets are empty, divided by the total number of end states:

\begin{equation} P(X \mathrm{~empty~buckets}) = \frac{\binom kx \binom{n-1}{k-x-1}}{\binom{n+k-1}{k-1}} \end{equation}

Plugging in your example values of $n=146,k=77,x=35$, I find a probability of approximately $0.00462$ that there are exactly $35$ empty buckets.