Suppose I have independent random variables $X_i$ which are distributed binomially via $$X_i \sim \mathrm{Bin}(n_i, p_i)$$.
Are there relatively simple formulae or at least bounds for the distribution $$S = \sum_i X_i$$ available?
Suppose I have independent random variables $X_i$ which are distributed binomially via $$X_i \sim \mathrm{Bin}(n_i, p_i)$$.
Are there relatively simple formulae or at least bounds for the distribution $$S = \sum_i X_i$$ available?
On
One short answer is that a normal approximation still works well as long as the variance $\sigma^2 = \sum n_i p_i(1-p_i)$ is not too small. Compute the average $\mu = \sum n_i p_i$ and the variance, and approximate $S$ by $N(\mu,\sigma)$.
On
This answer provides an R implementation of the explicit formula from the paper linked in the accepted answer (The Distribution of a Sum of Binomial Random Variables by Ken Butler and Michael Stephens). (This code can in fact be used to combine any two independent probability distributions):
# explicitly combine two probability distributions, expecting a vector of
# probabilities (first element = count 0)
combine.distributions <- function(a, b) {
# because of the following computation, make a matrix with more columns than rows
if (length(a) < length(b)) {
t <- a
a <- b
b <- t
}
# explicitly multiply the probability distributions
m <- a %*% t(b)
# initialized the final result, element 1 = count 0
result <- rep(0, length(a)+length(b)-1)
# add the probabilities, always adding to the next subsequent slice
# of the result vector
for (i in 1:nrow(m)) {
result[i:(ncol(m)+i-1)] <- result[i:(ncol(m)+i-1)] + m[i,]
}
result
}
a <- dbinom(0:1000, 1000, 0.5)
b <- dbinom(0:2000, 2000, 0.9)
ab <- combine.distributions(a, b)
ab.df <- data.frame( N = 0:(length(ab)-1), p = ab)
plot(ab.df$N, ab.df$p, type="l")
On
If you are looking for a software package to approximate the sum of non-identical binomial random variables, you can try this one:
https://cran.r-project.org/web/packages/sinib/index.html
Paper:
https://journal.r-project.org/archive/2018/RJ-2018-011/RJ-2018-011.pdf
See this paper (The Distribution of a Sum of Binomial Random Variables by Ken Butler and Michael Stephens).