Variance and mean of balls in bins limited capacity

Question

Variance and mean of balls in bins limited capacity

595 Views Asked by Bumbble Comm At 26 Mar 2026 - 7:59

Let there be $m$ indistinguishable balls, $k$ bins, $C$ capacity. Let $X_j$ denote the total balls in bin $j$. I've seen ways to calculate the total number of combinations, but I'm not sure how to go about calculating the mean and variance of $X_j$. It is understood that the ball is always thrown into one of the empty bins with uniform probability. edit: nonfull bins not empty bins sorry.

Original Q&A

There are 2 best solutions below

Bumbble Comm On 16 Feb 2019 - 5:29

Here's some a Monte Carlo simulation in Python for the problem. Disclaimer: it's not efficient.

import numpy as np
import sys

def simulate(num_balls, num_bins, capacity, num_sims):
    assert(capacity * num_bins >= num_balls)
    results = np.empty((num_sims,))
    for i in xrange(num_sims):
        unfilled_bins = set(range(num_bins))
        bin_counts = np.zeros((num_bins,))
        for _ in xrange(num_balls):
            chosen_bin = np.random.choice(list(unfilled_bins))
            bin_counts[chosen_bin] += 1
            if bin_counts[chosen_bin] >= capacity:
                unfilled_bins.remove(chosen_bin)
                if chosen_bin == 0:
                    break  # Stop the simulation prematurely.
        results[i] = bin_counts[0]
    return results

if __name__ == '__main__':
    try:
        num_balls, num_bins, capacity, num_sims = map(int, sys.argv[1:])
    except:
        sys.stderr.write(('usage: {} NUM_BALLS NUM_BINS CAPACITY '
                          'NUM_SIMS\n').format(sys.argv[0]))
        sys.exit(1)
    total_capacity = capacity * num_bins
    if total_capacity < num_balls:
        sys.stderr.write(('error: parameters must satisfy'
                          'CAPACITY * NUM_BINS >= NUM_BALLS`.\n'))
        sys.exit(2)
    results = simulate(num_balls, num_bins, capacity, num_sims)
    mean = results.mean()
    var = results.var()
    sys.stdout.write('mean = {}\tvar = {}\n'.format(mean, var))
```

**Bumbble Comm** · Accepted Answer

This seems computationally very difficult to me. You might do better with simulation.

It $C\ge m$ then the constraint has no effect, and we are just looking for the mean and variance of the number of balls in bin $j.$ This has a binomial distribution, and the answer is well known.

When ${m\over k}\leq C< M,$ the situation is much more complicated, because the result depends on the sequence in which the balls were thrown into the bins. Before any bin fills up, a ball has probability $1/k$ of landing in bin $j.$ After some other ball fills up, it has probability $1/(k-1)$ of landing in bin $j$. The situation is even more complicated if it's possible for more than one ball to fill up.

Consider the case $k=3, j=1.$ Let $P(x_1,x_2,x_3)$ be the probablity that at some point there are $x_i$ balls in bin $i$ for $i=1,2,3.$ To compute the expectation, we have to sum up terms of the form $aP(a,b,c)$ where $a+b+c=m.$ If $\max\{a,b,c\}<C,$ we have simply $$P(a,b,c)={m\choose a,b,c}3^{-n}\tag{1}$$ But suppose $c=C.$ Then $$P(a,b,c)=P(a,b,C)=\frac12P(a-1,b,C)+\frac12P(a,b-1,C)+\frac13P(a,b,C-1)\tag{2}$$ taking account of all bins the last ball might have been thrown into.

The last term on the right-hand side of $(2)$ can be computed directly from $(1),$ but the first two have to be computed recursively from $(2)$. It won't take many bins, or many balls, before this computation becomes unwieldy, if the capacity constraint is effective.

There are so many possible sequences of throws that memoization is unlikely to be effective, so far as I can see. I've been tying to think of ways to simplify the calculations, but so far, I don't have a glimmer.

Variance and mean of balls in bins limited capacity

There are 2 best solutions below

Related Questions in COMBINATORICS

Related Questions in COMBINATIONS

Related Questions in VARIANCE

Related Questions in MEANS

Trending Questions

Popular # Hahtags

Popular Questions