Efficiently distributing $N$ items in $M$ buckets


There are $d$ buckets, each with capacity $g$ (a bucket may hold between $0$ and $g$ items). A quota $l\in \mathbb{N}$ of identical items is drawn uniformly from $[0,g\cdot d]$, and the items have to be distributed among the buckets randomly. Buckets may be empty, or filled with at most $g$ items.
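As a concrete reading of the constraints, here is a small validity predicate (the function name and signature are illustrative, not from the thread):

```python
def is_valid(buckets, g, l):
    """Check that a bucket assignment satisfies the problem constraints:
    every bucket holds between 0 and g items, and the total equals l."""
    return all(0 <= b <= g for b in buckets) and sum(buckets) == l

# With d = 3 buckets of capacity g = 2 and quota l = 4:
# [2, 2, 0] and [1, 1, 2] are valid; [3, 1, 0] overfills a bucket.
```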

One trivial algorithm: keep a set $U$ of all unfilled buckets, then loop $l$ times; on each iteration, uniformly sample a bucket from $U$, place one item in it, and remove it from $U$ if it becomes full. This runs in $O(l \cdot \log d)$ time.
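A sketch of this baseline (the function name is hypothetical; using a list with swap-and-pop removal instead of a set even avoids the log factor):

```python
import random

def trivial_distribute(l, d, g):
    """Place l items one at a time, each into a uniformly chosen
    unfilled bucket, removing buckets as they fill up."""
    buckets = [0] * d
    unfilled = list(range(d))  # indices of buckets with spare capacity
    for _ in range(l):
        j = random.randrange(len(unfilled))
        i = unfilled[j]
        buckets[i] += 1
        if buckets[i] == g:
            # swap-and-pop: O(1) removal from the candidate list
            unfilled[j] = unfilled[-1]
            unfilled.pop()
    return buckets
```

The run time is still linear in $l$, which is exactly the problem when $g$ (and hence $l$) is very large.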

Is there a way to do this more efficiently? In particular, I need $g$ to scale to very large numbers: for example, 5 buckets with capacity 1,000,000 each.

Note: it's important that $l$ is drawn uniformly. If we simply sampled each bucket's count uniformly and independently, then $l$ would follow (a discrete analogue of) the Irwin–Hall distribution, which we do not want; otherwise that would have been a very efficient solution.
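To see concretely why independent per-bucket sampling fails, one can enumerate the exact distribution of the total for small parameters; the sum of independent discrete uniforms is peaked in the middle rather than flat:

```python
from collections import Counter
from itertools import product

def total_distribution(d, g):
    """Exact frequency of each total l when each of d buckets is filled
    independently and uniformly from {0, ..., g}."""
    return Counter(sum(combo) for combo in product(range(g + 1), repeat=d))

counts = total_distribution(d=2, g=2)
# Totals 0..4 occur with frequencies 1, 2, 3, 2, 1 out of 9 -- peaked at the
# middle, whereas a uniformly drawn quota l makes all five totals equally likely.
```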

=============

Here is some more background on where this comes from:

Consider a $d$-dimensional grid of grid size $g+1$; for example, in 2-D the grid goes from $(0,0)$ to $(g+1,g+1)$. Particles move from the origin to the opposite corner. I need to sample the position of a particle, and the sample has to be unbiased. The total length of a path to the opposite corner is $g\cdot d$, and since particles always move one step at a time towards the goal, the distance traveled $l$ is distributed uniformly.

=============

Here is Python code for the solution given by Mike Earnest below:

```
import numpy as np

def distrib(l, g):
    d = len(g)
    a = [0] * d
    if l <= 0:
        return a
    
    # Sampling phase
    used = 0
    for i in range(d):
        a[i] = 0 if g[i] == 0 else np.random.binomial(n=l-used, p=1 / (d-i))
        used += a[i]
        
    # Siphon overflow into recursive call
    used = 0
    g_rem = []
    for i in range(d):
        a[i] = min(g[i], a[i])
        used += a[i]
        g_rem.append(g[i] - a[i])
    l_rem = l - used
    c = distrib(l_rem, g_rem)
    
    # Combine
    res = []
    for i in range(len(g)):
        res.append(a[i] + c[i])
    return res

```
=============

BEST ANSWER

Here is a more efficient way to randomly distribute items into the buckets. I believe this produces exactly the distribution that you are after. I am posting a new answer because it is very different from my other approach.

First, we fill one bucket at a time using an appropriate binomial distribution, while ignoring the upper limit of $g$. Then, after all buckets are filled, we siphon off the overfull buckets, and redistribute the overflow items into the remaining under-full buckets using the same process. To make the recursion work nicely, it helps to allow non-uniform capacities, instead of just a uniform $g$ for each bucket.

Input: A nonnegative integer $\ell$, and a list $[g_1,g_2,\dots,g_d]$ of nonnegative integers.

Output: A random list $[B_1,\dots,B_d]$ of nonnegative integers, such that $B_1+\dots+B_d=\ell$ and $0\le B_i\le g_i$ for each $1\le i\le d$.

Method:

  1. Generate a list $[A_1,\dots,A_d]$ as follows. For each $i\in \{1,\dots,d\}$, first check if $g_i=0$.

    • If $g_i=0$, then $A_i=0$.
    • If $g_i>0$, then $A_i$ is generated according to the following binomial distribution: $$A_i\sim \text{Bin}(\ell-(A_1+\dots+A_{i-1}),1/(d-i+1))$$
  2. Compute the following quantities: \begin{align} \ell^{\text{rem}}&=\sum_{i=1}^d \max(A_i-g_i,0),\\ \text{for each $1\le i\le d$}, \quad g_i^\text{rem}&=\max(g_i-A_i, 0) \end{align}

  3. Generate a list $[C_1,\dots,C_d]$ using a recursive call to this algorithm with parameters $\ell^{\text{rem}}$ and $[g_1^\text{rem},\dots,g_d^\text{rem}]$.

  4. Output the list $[B_1,\dots,B_d]$, where $B_i=\min(A_i, g_i)+C_i$ for each $1\le i\le d$.
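A worked instance of steps 2–4, with numbers chosen purely for illustration: take $\ell=8$, capacities $[2,4,4]$, and a hypothetical draw $A=[5,0,3]$. The capped part is $[\min(5,2),0,3]=[2,0,3]$, so $\ell^{\text{rem}}=8-5=3$ and $g^{\text{rem}}=[0,4,1]$; the recursive call then places the 3 overflow items into the remaining capacity.

```python
def siphon(l, g, a):
    """One round of steps 2-4: cap each draw at its capacity and compute
    the leftover quota and capacities for the recursive call."""
    b = [min(ai, gi) for ai, gi in zip(a, g)]          # capped part of the draw
    l_rem = l - sum(b)                                  # overflow to redistribute
    g_rem = [max(gi - ai, 0) for ai, gi in zip(a, g)]   # remaining capacities
    return b, l_rem, g_rem

b, l_rem, g_rem = siphon(8, [2, 4, 4], [5, 0, 3])
# b = [2, 0, 3], l_rem = 3, g_rem = [0, 4, 1]
```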

Steps $1$, $2$, and $4$ each take $O(d)$ time. Each time we recurse, at least one more bucket becomes full, so this process loops at most $d$ times, resulting in $O(d^2)$ worst-case performance. I suspect that the average-case performance is closer to $O(d)$, but I cannot prove this.
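The sequential binomial draws in step 1 (ignoring the caps) are the standard way to sample a symmetric multinomial one coordinate at a time, which is exactly what the item-by-item baseline produces when no bucket fills up. For small parameters this can be checked exactly against the multinomial pmf; this is a consistency check I added, not part of the answer (note the code is 0-indexed, so $1/(d-i+1)$ in the answer becomes `1 / (d - i)` here):

```python
from math import comb, factorial

def seq_binomial_pmf(a, l, d):
    """Probability that the sequential scheme A_i ~ Bin(l - used, 1/(d - i))
    (0-indexed) produces the composition a, where sum(a) == l."""
    p, used = 1.0, 0
    for i, ai in enumerate(a):
        n, q = l - used, 1 / (d - i)
        p *= comb(n, ai) * q**ai * (1 - q) ** (n - ai)
        used += ai
    return p

def multinomial_pmf(a, l, d):
    """Pmf of the symmetric multinomial: l! / (a_1! ... a_d!) * (1/d)^l."""
    coef = factorial(l)
    for ai in a:
        coef //= factorial(ai)
    return coef * (1 / d) ** l
```

For instance, with $d=3$ and $\ell=4$, both formulas assign probability $12/81$ to the composition $(2,1,1)$.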

=============

With a little extra book-keeping, you can get away with choosing a uniformly random amount for each bucket. You just need to sometimes tighten the uniform interval to ensure you do not run out of space. I should describe the process in pseudocode, but hopefully the Python code is readable enough for you.

```
from random import randint

def random_distribution(d, g, l):

    # Input:  d and g are positive integers, and 0 <= l <= d * g
    # Output: A random list of d nonnegative integers, each between 0 and g, summing to l.

    buckets = [0] * d
    num_remaining = l

    for i in range(d - 1):

        # The i^th bucket is filled with a random number
        # between lower_limit and upper_limit.
        # - lower_limit is zero by default, but is increased
        #   to ensure the remaining items can fit in the remaining buckets.
        # - upper_limit is g by default, but is decreased
        #   to ensure num_remaining does not go negative.

        max_capacity_of_later_buckets = g * (d - i - 1)
        lower_limit = max(0, num_remaining - max_capacity_of_later_buckets)
        upper_limit = min(g, num_remaining)
        next_bucket_amount = randint(lower_limit, upper_limit)
        buckets[i]    += next_bucket_amount
        num_remaining -= next_bucket_amount

    # Place all remaining items in the last bucket.
    buckets[d-1] += num_remaining
    return buckets
```

Clearly, this is $O(d)$ (in terms of number of arithmetic operations), which is faster than $O(l)$. In particular, the run time does not increase as $g$ increases.

The only problem with this method is that while all of the bucket counts are random, they do not all have the same distribution. For example, when $l$ is on the small end of $[0,dg]$, there will be lots of zeroes among the bucket sizes, and these zeroes will tend to clump at the end of the bucket list. This can be fixed by randomly permuting the buckets afterwards, which restores symmetry. Using the Fisher–Yates shuffle, the permutation also takes $O(d)$ time.
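The shuffle fix can be applied directly; Python's `random.shuffle` is an in-place Fisher–Yates, so the whole pipeline stays $O(d)$. A self-contained sketch combining both steps (the wrapper name is mine, not from the answer):

```python
import random

def random_distribution_shuffled(d, g, l):
    """Sequential bounded-uniform fill as in the answer above, followed by
    a Fisher-Yates shuffle to restore symmetry among the buckets."""
    buckets = []
    num_remaining = l
    for i in range(d - 1):
        max_capacity_of_later_buckets = g * (d - i - 1)
        lower_limit = max(0, num_remaining - max_capacity_of_later_buckets)
        upper_limit = min(g, num_remaining)
        amount = random.randint(lower_limit, upper_limit)
        buckets.append(amount)
        num_remaining -= amount
    buckets.append(num_remaining)  # last bucket takes the remainder
    random.shuffle(buckets)        # random.shuffle is Fisher-Yates, O(d)
    return buckets
```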