Show that X-Y does not have a Binomial distribution

If $X \sim \operatorname{Bin}(n, p)$ and $Y \sim \operatorname{Bin}(m, p)$, I want to show that $X-Y$ is not binomial.

I tried to adapt the proof given in lecture, that the sum of two independent binomial variables is binomial with $X+Y \sim \operatorname{Bin}(n+m, p)$, to the analogous situation $X-Y$.

I suspect that $X-Y$ should be distributed as $\operatorname{Bin}(n-m, p)$, and is therefore not binomial when $m > n$, since that would mean a negative number of independent Bernoulli trials with a positive success probability $p$; by the definition of the binomial distribution this makes no sense. The probability of "success" in zero or fewer Bernoulli trials should be zero, not positive.

This would seem to agree with the solution given by the book:

"A Binomial can’t be negative, but $X-Y$ is negative with positive probability." But can $X-Y$ be negative in general, or only when $m > n$?

Is this the right train of thought?

If so, is there a mathematical way to express $X-Y \sim \operatorname{Bin}(n-m, p)$?

There are 2 best solutions below

7
On

No, the difference does not follow a binomial distribution, even when $m\le n$. The argument in the book is simply that $X-Y$ can be negative while a binomial cannot, and that suffices to prove the claim. Even when $m\le n$, you can still have $X=0$ and $Y=1$ (except in the trivial case where $m=0$).
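
To make that explicit, here is a short bound. It assumes $X$ and $Y$ are independent (as in the lecture result for the sum), $0<p<1$ and $m\ge 1$; none of these is stated outright in the question:

$$P(X-Y<0)\;\ge\;P(X=0,\,Y=1)\;=\;(1-p)^{n}\cdot m\,p\,(1-p)^{m-1}\;=\;m\,p\,(1-p)^{n+m-1}\;>\;0.$$

A binomial random variable takes values in $\{0,1,\dots,N\}$ only, so it assigns probability zero to the negative integers.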

However, if you are in the regime where a binomial can be approximated by a normal, then the difference between two approximately normal things will be approximately normal (but that's beside the point here).

1
On

$X+Y$ is binomially distributed because the total count of successes in two independent sequences of $n$ and $m$ iid Bernoulli trials, each with success rate $p$, is the same thing as the count of successes in a single sequence of $n+m$ iid Bernoulli trials, each with success rate $p$.
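
The same conclusion can be sketched through the convolution of the two mass functions, assuming $X$ and $Y$ are independent and using the convention $\binom{a}{b}=0$ for $b>a$. The last step is Vandermonde's identity:

$$P(X+Y=k)=\sum_{j=0}^{k}\binom{n}{j}p^{j}(1-p)^{n-j}\binom{m}{k-j}p^{k-j}(1-p)^{m-k+j}=p^{k}(1-p)^{n+m-k}\sum_{j=0}^{k}\binom{n}{j}\binom{m}{k-j}=\binom{n+m}{k}p^{k}(1-p)^{n+m-k}.$$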


$X-Y$ is not binomially distributed because the difference between the counts of successes in two independent sequences of $n$ and $m$ iid Bernoulli trials, each with success rate $p$, cannot be expressed as a count of successes in a sequence of some number of iid Bernoulli trials.

Also, as noted, whenever $X$ realises a value less than $m$, it is possible for $X-Y$ to be negative, which is never the case for a binomially distributed random variable.
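
As a numerical sanity check, the following minimal sketch computes the exact mass function of $X-Y$ by brute-force convolution. It assumes $X$ and $Y$ are independent. The helper names `binom_pmf` and `diff_pmf` and the values $n=5$, $m=2$, $p=0.3$ are illustrative choices, not taken from the question.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(K = k) for K ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def diff_pmf(n, m, p):
    """Exact pmf of X - Y, assuming independent X ~ Bin(n, p), Y ~ Bin(m, p)."""
    # X - Y can take any integer value from -m (X = 0, Y = m) up to n.
    pmf = {d: 0.0 for d in range(-m, n + 1)}
    for x in range(n + 1):
        for y in range(m + 1):
            pmf[x - y] += binom_pmf(x, n, p) * binom_pmf(y, m, p)
    return pmf


# Illustrative values (an assumption), chosen with m <= n on purpose.
n, m, p = 5, 2, 0.3
pmf = diff_pmf(n, m, p)

p_negative = sum(prob for d, prob in pmf.items() if d < 0)
mean = sum(d * prob for d, prob in pmf.items())
var = sum((d - mean) ** 2 * prob for d, prob in pmf.items())

print(f"P(X - Y < 0) = {p_negative:.4f}")
print(f"mean = {mean:.4f}, variance = {var:.4f}")
```

The printed probability of a negative value is strictly positive even though $m\le n$. The mean and variance can be compared with the closed forms $(n-m)p$ and $(n+m)p(1-p)$, which follow from linearity and independence. A $\operatorname{Bin}(n-m,p)$ variable would instead have variance $(n-m)p(1-p)$, so that guess fails on the variance as well whenever $m\ge 1$ and $0<p<1$.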