Upper Bound on the Probability of the Difference of Binomial Distributions

98 Views Asked by At

We start by defining a binomial difference distribution

Let $X\sim \text{Bin}(n, p)$, $Y\sim \text{Bin}(n, q)$, $Z=X-Y$

I've found out that this distribution is somewhat difficult to write down exactly for arbitrary $n$.

What I'm interested in is $P(Z > 0)$, which I've discovered to be some polynomial of $p$ and $q$ of degree $n$.

(I computed a few with Mathematica to try and get some intuition)

$$\left( \begin{array}{ll} \text{n}=1 & p-p q \\ \text{n}=2 & -3 p^2 q^2+4 p^2 q-p^2+2 p q^2-4 p q+2 p \\ \text{n}=3 & -10 p^3 q^3+18 p^3 q^2-9 p^3 q+p^3+12 p^2 q^3-27 p^2 q^2+18 p^2 q-3 p^2-3 p q^3+9 p q^2-9 p q+3 p \\ n=4 & -35 p^4 q^4+80 p^4 q^3-60 p^4 q^2+ \,\ldots \end{array} \right)$$

Since these polynomials become increasingly complex I'm looking for a way to write a tight upper bound in the form $P(Z > 0) \leq f(p, q, n)$. If this is too hard I'm also satisfied with a bound for the case of $p=q$

My only approach has been to use Hoeffding's inequality on the case where $p=q$ to bound $P(Z > 0)\leq \exp(-1/2n)$ by interpreting $Z$ as the sum of independent rvs between -1 and 1.

Any other suggestions for a tighter bound would be appreciated.

Thanks

EDIT:

I've come up with one tighter bound for $p=q$ by solving it for $p=1/2$ which should be maximal, because the Binomial distribution has the highest variance for $p=1/2$.
$$P(Z>0)\leq \frac{1}{2}-\frac{2^{-2 n-1} (2 n)!}{(n!)^2}$$

I have not proved this result, I just discovered it computationally. So it would be nice to know if this result is known and if a proof exists.

1

There are 1 best solutions below

0
On

After a bit more work I found a proof for the bound $$P(Z>0)\leq \frac{1}{2}-\frac{2^{-2 n-1} (2 n)!}{(n!)^2}$$

Here is a simple sketch:

  1. $p=1/2$ maximizes the variance of Bin(n, p) which maximizes $P(Z>0)$
  2. So $P(Z>0)\leq P(Z | p=1/2)$ (abusing notation a bit)
  3. From now on let $p=1/2$ and we will solve $P(Z>0)$ exactly
  4. Note that $P(Z>0)=P(X>Y)=1-P(X=Y)-P(X<Y)$
  5. $P(Z>0)=\frac{1}{2}\left(1-P(X=Y)\right)$
  6. define $f(n,k)=P(X-Y=k)$ for $0\leq k \leq n$
  7. $f(n, k) = \frac{1}{2} f(n-1, k) + \frac{1}{4} f(n-1, k-1) + \frac{1}{4} f(n-1, k+1)$
  8. Plug in the solution $$f(n, k)= \frac{4^{-n} (2 n)!}{(n-k)! (k+n)!}$$ and show that it solves the recurrence and appropriate base cases.
  9. Set $k=0$ to get $$P(X=Y)=\frac{4^{-n} (2 n)!}{(n!)^2}$$
  10. plug step 9 into step 5

I'm now realizing there is probably a simple combinatorial argument, but hopefully this method is appreciated.

I'm still wondering if there is a tighter result that is a function of $n$ and $p$.