Distribution of weighted sum of Bernoulli RVs

4.7k Views Asked by At

Let $x_1,...,x_m$ be drawn from independent Bernoulli distributions with parameters $p_1,...,p_m$.

I'm interested in distribution of $t=\sum_i a_ix_i,~a_i\in \mathbb{R}$

$m$ is not large so I can not use central limit theorems.

I have the following questions:

1- What is the distribution of $s=\sum_i x_i$?

2- What is the distribution of $t=\sum_i a_ix_i$ or $t=\sum_i a_ix_i-\sum_i a_i$ (to ensure non-negative support) for known $a_i$'s? can I approximate its distribution with a Gamma distribution? If yes, what would be the parameters (as a function of $p_i$'s and $a_i$'s)?

3- Is there a truncated Gamma distribution (or any other distribution (except normal)) that can approximately fits my problem?

However, $m$ is not very large, but it is still very large such that I can not calculate the distribution by convolution.

Thanks a bunch!

1

There are 1 best solutions below

3
On BEST ANSWER

I did some search and this is what I have found:

  • Bounds for tail probabilities of a weighted sum of Bernoulli trials
    This seminal paper of Raghavan in 1988: "Probabilistic Construction of Deterministic Algorithms: Approximating packing integer programs". In section 1.1 he derives some bounds, which seems to be important as part of an stochastic optimization technique called randomized rounding.
  • Learning the distribution of a weigthed sum of Bernoulli trials
    In this paper published this year by Daskalis et al.: "Learning Poisson Binomial Distributions". In theorem 2 they states that it's possible to construct an algorithm that learns the desired distribution in polynomial time.
  • Code to compute the distribution of a weigthed sum of Bernoulli trials
    This post in researchgate.net of Dayan Adoniel also asked the same and it seems that Dayan developed a code to compute the desired distribution and that he is willing to share it.

Unfortunately I do not have the time now to go over the details of the first two references but hopefully you can find them useful.


Edit 1: Bounds in the tails of the distribution of a weighted sum of independent Bernoulli trials [Raghavan, 1988]

Let $a_1, a_2, \ldots, a_r$ be reals in $(0, 1]$. Let $X_1, X_2, \ldots, X_r$ be independent Bernoulli trials with $E[X_j] = p_j$. Define $\Psi = \sum_{j=1}^r a_jX_j$ with $E[\Psi] = m$.

Theorem 1 (Deviation of $\Psi$ above its mean). Let $\delta > 0$ and $m = E[\Psi] > 0$. Then

$$\mathbf{P}(\Psi > m(1+\delta)) < \left[\frac{e^\delta}{(1+\delta)^{(1+\delta)}}\right]^m.$$

Theorem 2 (Deviation of $\Psi$ below its mean). For $\gamma \in (0,1]$,

$$\mathbf{P}(\Psi-m < -\gamma m) < \left[\frac{e^\gamma}{(1+\gamma)^{(1+\gamma)}}\right]^m.$$