Balls and bins lower bound with random algorithm

62 Views Asked by At

Assume we have $n$ bins and $B \sim Binom(nk,p)$ balls to distribute in an equitable.

Furthermore, assume that each ball arrives one at a time and must be placed in a bin before the next arrives. Lastly, assume that we use a randomized algorithm that randomly places a ball in a bin at each iteration. In the simpler setting with a "greedy" algorithm we can say that each bin will receive $kp$ balls in expectation, but what lower bound can we argue on each bin in the randomized setting? Can we easily give a result such that for some $\gamma$ we have $$Pr\left(\frac{1}{n} B < \gamma\right)$$ is very small (say $< 1/n$)?

1

There are 1 best solutions below

1
On BEST ANSWER

To show that each bin gets at least $\gamma$ balls with high probability, it is not enough to show that $\Pr[\frac1n B > \gamma]$ is high. Some bins are going to get fewer than $\frac1n B$ balls.

If the total number of balls is $B \sim \text{Binomial}(nk,p)$, then the number of those balls that end up in the $i^{\text{th}}$ bin is $B_i \sim \text{Binomial}(nk, \frac{p}{n})$. Here is the reasoning: we can think of $B \sim \text{Binomial}(nk, p)$ as saying "we have $nk$ balls total, but each one of them has probability $p$ of actually being placed in a bin". In that case, each of the $nk$ balls has probability $\frac pn$ of being placed in the $i^{\text{th}}$ bin in particular.

We can bound the probability that $B_i$ is small by a Chernoff bound. The expected number of balls in the $i^{\text{th}}$ bin is $nk \cdot \frac pn = pk$. Therefore $$\Pr[B_i \le (1-\delta) pk] \le e^{-\delta^2 pk/3}.$$ If we then want to bound the probability that any bin gets fewer than $(1-\delta)pk$ balls, then by the union bound it is at most $n e^{-\delta^2 pk/3}$.

If you want that probability to be $\frac1n$, then we want $e^{-\delta^2 pk/3} = \frac1{n^2}$, so $\delta^2 pk/3 = 2 \log n$ and $\delta = \sqrt{\frac{6 \log n}{pk}}$. So the threshold we can guarantee with this probability is $\gamma = (1 - \sqrt{\frac{6 \log n}{pk}})pk$, or $\gamma = pk - \sqrt{6pk \log n}$.