I am working on a balls in boxes kind of problem, where the probability of a ball ending up in a certain box varies by box, that is, each box has some probability P of getting any ball, all together summing to 1 obviously.
I'm trying to answer "what is the probability that after X balls are assigned to K boxes that any box has more than Z balls?" kind of questions.
I my cases, the number of boxes can be large (>500) with number of balls ~10-20% of the box count. Now, when the probability for all the boxes is uniform, I can short-cut the problem and get results quickly by getting all possible combinations of integer partitions with a maximum of at least Z, calculating the probability of such a combination, and multiplying that by the number of permutations of that combination, finally summing. But when they differ, I have to enumerate each of the possible permutations, calculate the probabilities, and sum. This gets huge quite quickly.
Question: is there any kind of reasonably accurate estimator for this kind of probability question that does not require total enumeration?
Here is an excellent approximation method. The approximation gets better for larger sized problems.
Notation: $x$=number of balls, $u$=number of boxes (or urns as statisticians prefer), $N_i$ is the number of balls in box $i$ and $p_i$ is the probability that a ball lands in box $i.$
We know $(N_1,...,N_u)$ is multinomial with binomial maginals for each $N_i$. Of course, $N_i$ and $N_j$ are dependent but the more boxes we have, the less the dependence. The dependence also is a function of the $p_i.$
I worked out the correlation: $$ \rho=-\sqrt{\frac{p_ip_j}{(1-p_i)(1-p_j)}} $$ If the probabilities are equal, $\rho=-\frac {1}{u-1} . $ Given such a small dependence, let's treat the $(N_i)$ as independent.
Then, for $z=1,2,...,x$ let $$F(z)=P(\max_{1\le i\le u} N_i\le z)$$ $$\approx \prod_{1\le i \le u}P(N_i\le z)$$ $$=\prod_{1\le i \le u} \sum_{m=0}^{z}{x \choose m} p_i^m (1-p_i)^{x-m} $$ Then $1-F(z)$ is the probability that at least one of the boxes contains more than $z$ balls.