Inequality for Binomial distribution function

219 Views Asked by At

Suppose $F(y;n,p)$ is the binomial distribution function, i.e. the probability that there are $y$ or fewer successes out of $n$ independent Bernoulli trials each with success probability $p$.
Is it true that for all positive integers $n$ and all $p \in (0,1)$ and all integers $y \in \{0,1,...,n-1\}$, we have $$F(y;n+1,p)<F(y;n,p)<F(y+1;n+1,p)$$

2

There are 2 best solutions below

0
On

The second inequality is true, because having not greater than $x$ successes in the first $n$ trials implies that you will not have more than $x+1$ successes in $n+1$ trials (even if the last trial is a success). It is a strict inequality, because since $x<n$ there is a non-zero probability that there will be $x+1$ successes among the first $n$ and the last one will be a failure.

The first inequality can't be true, because replacing $p$ by $1-p$ is the same as swapping successes with failures. The distribution function for failures is $1-F(x,n,p)$, so the inequality will be inverted.

Strictly speaking, suppose it is true. Let $$G(x,n,p) = F(x, n, 1-p) = 1 - F(n-x - 1, n, p)$$ Where the last equality comes from the fact that having not greater than $x$ failures is the same as having more than $n-x-1$ successes. Since the inequality must also hold for $G$, we have $$-F(n-x - 1,n+1,p) < -F(n-x - 1, n, p)$$ Calling $x' = n-x-1$, we rewrite this as $$F(x',n+1,p) > F(x',n,p)$$ This is a contradiction.

0
On

It's true.

Let $X_1,..,X_n$ be independent and identically distributed Bernoulli random variables such that $P[X_i=1]=1-P[X_i=0]=p$ and let $Y_n=\sum_{i=1}^n X_i$ and $Y_{n+1}=\sum_{i=1}^{n+1} X_i$.

Then, $$F(y;n+1,p)=P[Y_{n+1}\le y]=P[Y_n\le y-1 \text{ or } \{Y_n=y \text{ and } X_{n+1}=0\}]$$ $$=P[Y_n\le y-1]+P[Y_n=y \text{ and } X_{n+1}=0]=P[Y_n\le y-1]+P[Y_n=y]P[X_{n+1}=0]$$ $$=P[Y_n\le y-1]+P[Y_n=y](1-p)<P[Y_n\le y-1]+P[Y_n=y]=F(y;n,p)$$

Also, $$F(y+1;n+1,p)=P[Y_{n+1}\le y+1]=P[Y_n\le y \text{ or } \{Y_n=y+1 \text{ and } X_{n+1}=0\}]$$ $$=P[Y_n\le y]+P[Y_n=y+1 \text{ and } X_{n+1}=0]>P[Y_n\le y]=F(y;n,p)$$


The proofs of the two inequalities establish these recurrence relationships:

$$F(y;n+1,p)=P[Y_{n+1}\le y]=P[Y_n\le y-1]+P[Y_n=y](1-p)$$ $$=F(y-1;n,p)+{n\choose{y}}p^y (1-p)^{n-y+1}$$

and

$$F(y+1;n+1,p)=P[Y_n\le y]+P[Y_n=y+1 \text{ and } X_{n+1}=0]$$ $$=F(y;n,p)+{n\choose{y+1}}p^{y+1} (1-p)^{n-y}$$

The second recurrence relation is actually the same as the first with $y+1$ replacing $y$.

This R program can be used to check that they are true for any $n$ and $p$, but I don't see how either of them can be proven algebraically using the definition of the binomial distribution function $F(y)=\sum_{i=0}^y {n\choose i}p^i(1-p)^{n-i}$.

n=50
p=0.32
y=c(0:(n-1))
max(abs(pbinom(y,n+1,p)-(pbinom(y-1,n,p)+choose(n,y)*p^y*(1-p)^(n-y+1))))
max(abs(pbinom(y+1,n+1,p)-(pbinom(y,n,p)+choose(n,y+1)*p^(y+1)*(1-p)^(n-y))))

Mathematica doesn't seem to know the recurrence relation.

InputForm[
FullSimplify[
 Assuming[n>0 && Element[n,Integers] && y >= 0 && y < n && Element[y,Integers] 
              && p>0 && p<1, 
  Sum[Binomial[n + 1, i] p^i (1 - p)^(n + 1 - i), {i, 0, y}] - 
     (Sum[Binomial[n, i] p^i (1 - p)^(n - i), {i, 0, y - 1}] + 
     Binomial[n, y] p^y (1 - p)^(n - y + 1))]]]

evaluates in Mathematica to:
$(1 - p)^{n - y}*p^y* (Binomial[n, y]*(-1 + p + Hypergeometric2F1[1, -n + y, 1 + y, p/(-1 + p)]) - p*Binomial[1 + n, 1 + y]*Hypergeometric2F1[1, -n + y, 2 + y, p/(-1 + p)])$