Subtraction between two binomial variables


Assuming X and Y are two independent binomial random variables:

$X \sim B(n_x,\alpha)$, $Y \sim B(n_y,\alpha)$

What can be said about the distribution of $Z$, where $Z=Y-X$?

Specifically, is the PDF of $Z$ a unimodal function?


There are 2 best solutions below


Yes, the function $f(z) = \mathop{\text{Prob}}(Z=z)$ is a unimodal function of the integer $z$. [Hm, is it kosher to call $f$ a "probability density function" in this discrete setting?] More generally, this is true for any variable of the form $Z = \sum_{i=1}^n X_i$ where the $X_i$ are independent random variables, each taking the value $1$ with probability $p_i$ and $0$ with probability $q_i = 1 - p_i$. To see that $Z=X-Y$, with $X \sim B(n_x,\alpha)$ and $Y \sim B(n_y,\alpha)$, is indeed a special case, set $n=n_x+n_y$, $\;p_i = \alpha$ for $i \leq n_x$, and $q_i = \alpha$ for $i > n_x$. Then $\sum_{i \leq n_x} X_i$ is distributed as $X$ and $\sum_{i > n_x} X_i$ as $n_y - Y$, so $\sum_{i=1}^n X_i$ has the distribution of $X - Y + n_y$; now shift by $n_y$. (The question's $Z = Y-X$ is the negative of this, and negation preserves unimodality.)

Once the result is stated in this more general form, it can be proved by induction on $n$. The base case $n=0$ is clear, because then $f(0)=1$ and $f(z)=0$ for all $z \neq 0$. Assume we have proven the result for $n-1$, so $Z_1 = \sum_{i=1}^{n-1} X_i$ has PDF $f_1(z) = \mathop{\text{Prob}}(Z_1=z)$ that is unimodal with a maximum at $z_1$. Conditioning on $X_n$ gives $f(z) = q_n\,f_1(z) + p_n\,f_1(z-1)$, so $f(z)\leq f(z')$ for $z \leq z' \leq z_1$ and $f(z)\geq f(z')$ for $z_1 < z \leq z'$. Therefore $f$ is unimodal with a maximum at $z_1$ or $z_1+1$. This completes the induction step and the proof.
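The induction step can be checked numerically. The following is a minimal Python sketch, not part of the original answer: it builds the PMF of a sum of independent Bernoulli variables one variable at a time (conditioning on whether the new variable is 0 or 1) and tests unimodality. The function names `bernoulli_sum_pmf` and `is_unimodal` and the parameters $n_x = 4$, $n_y = 6$, $\alpha = 0.3$ are hypothetical, chosen only for illustration.

```python
def bernoulli_sum_pmf(probs):
    """PMF of a sum of independent Bernoulli(p_i) variables, as a list indexed by z."""
    pmf = [1.0]  # base case n = 0: the empty sum equals 0 with probability 1
    for p in probs:
        q = 1.0 - p
        new = [0.0] * (len(pmf) + 1)
        for z, mass in enumerate(pmf):
            new[z] += q * mass      # new variable is 0: the sum stays at z
            new[z + 1] += p * mass  # new variable is 1: the sum moves to z + 1
        pmf = new
    return pmf


def is_unimodal(pmf, tol=1e-12):
    """True if the sequence weakly rises to a peak and then weakly falls."""
    peak = max(range(len(pmf)), key=pmf.__getitem__)
    rising = all(pmf[i] <= pmf[i + 1] + tol for i in range(peak))
    falling = all(pmf[i] >= pmf[i + 1] - tol for i in range(peak, len(pmf) - 1))
    return rising and falling


# Hypothetical parameters: X ~ B(4, 0.3), Y ~ B(6, 0.3), Z = X - Y.
n_x, n_y, alpha = 4, 6, 0.3
# p_i = alpha for the first n_x variables, q_i = alpha for the remaining n_y.
probs = [alpha] * n_x + [1.0 - alpha] * n_y
shifted = bernoulli_sum_pmf(probs)  # PMF of Z + n_y on 0, ..., n_x + n_y
pmf_z = {k - n_y: mass for k, mass in enumerate(shifted)}  # undo the shift
print(is_unimodal(shifted))
```

A numerical check like this only tests particular parameter values; the induction argument is what establishes the result in general.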

The same technique proves more generally that the function $\lambda^z f(z)$ is unimodal for all $\lambda > 0$; equivalently, $f$ is not just unimodal but log-concave: $f(z)^2 \geq f(z-1) \, f(z+1)$ for all $z$ (and in our case equality holds only when $f(z)=0$).
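One way to see the stated equivalence is the following sketch, which assumes the support of $f$ is an interval of integers (true for sums of independent Bernoulli variables); the ratio $r(z)$ is notation introduced here. On the support, write $r(z) = f(z)/f(z+1)$. Then

$$ \lambda^{z+1} f(z+1) \geq \lambda^{z} f(z) \iff \lambda \geq r(z), \qquad f(z)^2 \geq f(z-1)\,f(z+1) \iff r(z-1) \leq r(z). $$

If $f$ is log-concave, $r$ is nondecreasing, so for each fixed $\lambda > 0$ the sequence $\lambda^z f(z)$ rises while $\lambda \geq r(z)$ and falls afterwards, which is unimodality. Conversely, if $r(z-1) > r(z)$ for some $z$, then any $\lambda$ with $r(z) < \lambda < r(z-1)$ makes $\lambda^z f(z)$ strictly smaller than both of its neighbours, so $\lambda^z f(z)$ is not unimodal.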


The modality component is already answered. This answer derives a 'closed-form' solution for the pmf. If I may change the notation slightly, and make it more general (allowing each Binomial to have different parameters):

The Problem

Let $X_1 \sim \text{Binomial}(n,p)$ and $X_2 \sim \text{Binomial}(m,q)$ be independent.

Find the pmf of $Z = X_1-X_2$.

Given: Due to independence, the joint pmf of $(X_1, X_2)$, say $f(x_1,x_2)$, is:

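$$ f(x_1,x_2) = \binom{n}{x_1} p^{x_1}(1-p)^{n-x_1}\,\binom{m}{x_2} q^{x_2}(1-q)^{m-x_2}, \qquad x_1 \in \{0,\dots,n\},\; x_2 \in \{0,\dots,m\} $$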

Solution

Let $Z=X_1-X_2$ and $Y=X_2$. Then, the joint pmf of $(Z,Y)$, say $g(z,y)$, is:

$$ g(z,y) = \binom{n}{y+z} p^{y+z}(1-p)^{n-y-z}\,\binom{m}{y} q^{y}(1-q)^{m-y} $$

where I am using the Transform function from the mathStatica package to automate the derivation of the joint pmf using the Method of Transformations (I am one of the authors of the package). Deriving the domain of support of $Y$ and $Z$ is a bit more tricky. To make things clearer, here is a rough diagram that illustrates (a smoothed continuous version of) the domain of support:

[Figure: domain of support of $(Z, Y)$]

This suggests two cases:

  • Case 1: When $z \ge 0$: $0 \le y \le n-z$

  • Case 2: When $z < 0$: $-z \le y \le m$

The pmf of $Z=X_1-X_2$ is then obtained by summing out $Y$ in each part of the domain:

[Image: mathStatica output for the two sums, pmfCase1 and pmfCase2]


In summary: The pmf of $Z=X_1-X_2$, say $h(z)$, is:

$$ h(z) = \begin{cases}\text{pmfCase1} & z = 0,1, \dots, n\\ \text{pmfCase2} & z = -1, -2, \dots, -m \end{cases}$$
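For reference, pmfCase1 and pmfCase2 can be written as explicit sums. This is a sketch obtained by substituting $x_1 = z+y$ and $x_2 = y$ into the product of the two binomial pmfs and summing over the ranges of Case 1 and Case 2; it is not necessarily the form that mathStatica returns.

$$ \text{pmfCase1} = \sum_{y=0}^{n-z} \binom{n}{z+y} p^{z+y}(1-p)^{n-z-y}\,\binom{m}{y} q^{y}(1-q)^{m-y} $$

$$ \text{pmfCase2} = \sum_{y=-z}^{m} \binom{n}{z+y} p^{z+y}(1-p)^{n-z-y}\,\binom{m}{y} q^{y}(1-q)^{m-y} $$

With the usual convention that $\binom{a}{b} = 0$ when $b > a$, any term with $y > m$ in the first sum or $z + y > n$ in the second vanishes, so both sums effectively stop at $y = \min(m,\, n-z)$.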


All done.

Here is a quick plot of the derived pmf of $Z=X_1-X_2$ (the blue round dots), when $n = 3$, $m = 9$, $p = 0.8$, $q = 0.4$. The red triangles are a quick Monte Carlo check superimposed on top, just to make sure everything is correct:

[Figure: derived pmf of $Z$ (blue dots) with Monte Carlo estimates (red triangles)]
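A check of this kind can be sketched in plain Python. This is not the author's mathStatica code: the function names `diff_pmf` and `simulate`, the trial count, and the seed are assumptions made for illustration. The exact pmf sums the joint pmf of $(Z, Y)$ over $y$, with the summation limits capped so that both $0 \le y \le m$ and $0 \le z + y \le n$ hold.

```python
import random
from math import comb


def diff_pmf(z, n, p, m, q):
    """P(X1 - X2 = z) for independent X1 ~ Binomial(n, p), X2 ~ Binomial(m, q)."""
    lo = max(0, -z)     # Case 1 (z >= 0) starts at y = 0, Case 2 (z < 0) at y = -z
    hi = min(m, n - z)  # keep x2 = y <= m and x1 = z + y <= n
    return sum(
        comb(n, z + y) * p ** (z + y) * (1 - p) ** (n - z - y)
        * comb(m, y) * q ** y * (1 - q) ** (m - y)
        for y in range(lo, hi + 1)
    )


def simulate(n, p, m, q, trials, seed=0):
    """Monte Carlo estimate of the pmf of X1 - X2 as a dict {z: relative frequency}."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(trials):
        x1 = sum(rng.random() < p for _ in range(n))  # one Binomial(n, p) draw
        x2 = sum(rng.random() < q for _ in range(m))  # one Binomial(m, q) draw
        counts[x1 - x2] = counts.get(x1 - x2, 0) + 1
    return {z: c / trials for z, c in counts.items()}


n, p, m, q = 3, 0.8, 9, 0.4  # the parameter values quoted in the text
exact = {z: diff_pmf(z, n, p, m, q) for z in range(-m, n + 1)}
approx = simulate(n, p, m, q, trials=100_000)
for z in range(-m, n + 1):
    print(f"{z:3d}  exact={exact[z]:.4f}  simulated={approx.get(z, 0.0):.4f}")
```

The two columns should agree to roughly two decimal places at this number of trials; a larger discrepancy would point to an error in the summation limits.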