Probability of Random Variable Minus Random Variable


$X_1 , X_4$ ~ $ Binomial(18000,1/6)$.

So $X_1+X_4$ ~ $Binomial(18000,1/3)$.

I am asked to find $P(X_1-X_4\leq 80)$.

The solution is to find $Var(X_1-X_4)=6000$ and $E[X_1-X_4]=0$, and then do the following: $$ P(X_1-X_4\leq 80)=P\left(\frac{X_1-X_4} {\sqrt {6000}}\leq \frac {80} {\sqrt {6000}}\right) = P\left(\frac{X_1-X_4} {\sqrt {6000}}\leq 1.033\right)=0.1508 $$

My question is how the solver got this: $$ P\left(\frac{X_1-X_4} {\sqrt {6000}}\leq 1.033\right)=0.1508 $$

It looks to me like he treated $\frac{X_1-X_4}{\sqrt{6000}}$ as a standard normal variable, but that answer doesn't fit the standard normal C.D.F., and I don't think that $X_1-X_4$ is normally distributed to begin with.

EDIT: Things I didn't say that are apparently important:

  1. $X_1, X_4$ are the numbers of times the results 1 and 4 come up in 18,000 rolls of a die, so as far as I know (and according to the solution I am using) this should mean that $X_1+X_4$ ~ $Bin(18000,1/3)$, the number of rolls out of $18000$ that show a $1$ or a $4$.
  2. $X_1$ and $X_4$ are not independent: $$ \frac {1} {3}\cdot \frac{2}{3}\cdot 18000 = V(X_1+X_4) = V(X_1)+V(X_4)+2Cov(X_1,X_4) $$ $$ \Rightarrow Cov(X_1,X_4)=\frac{-18000}{36} = -500 $$

$$ V(X_1-X_4) = V(X_1)+V(X_4)-2Cov(X_1,X_4)=6000 $$
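The arithmetic above can be double-checked with exact fractions. This is only a sketch: the variable names are made up for illustration, and it assumes nothing beyond the binomial variance formula $np(1-p)$ and the identities already written out.

```python
from fractions import Fraction

n = 18000
p_single = Fraction(1, 6)   # probability of one specific face (a 1, or a 4)
p_either = Fraction(1, 3)   # probability that a roll shows a 1 or a 4

var_x1 = n * p_single * (1 - p_single)    # V(X_1) = V(X_4)
var_sum = n * p_either * (1 - p_either)   # V(X_1 + X_4)

# Solve V(X_1 + X_4) = V(X_1) + V(X_4) + 2 Cov(X_1, X_4) for the covariance.
cov = (var_sum - 2 * var_x1) / 2

# V(X_1 - X_4) = V(X_1) + V(X_4) - 2 Cov(X_1, X_4)
var_diff = 2 * var_x1 - 2 * cov

print(var_x1, var_sum, cov, var_diff)  # prints: 2500 4000 -500 6000
```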

Best answer

Actually, if $X_1 , X_4 \sim \mbox{Binomial}(18000,1/6)$ were independent, then $X_1 + X_4$ would be $\mbox{Binomial}(36000,1/6)$.

For the variables as further described in the edited question, the sum $X_1 + X_4$ counts how often you roll 1 or 4 in 18000 rolls of a die, which amounts to 18000 Bernoulli trials with $p = 1/3$, so yes, $X_1 + X_4 \sim \mbox{Binomial}(18000,1/3)$.

As you correctly observed, the distribution of $X_1 - X_4$ is not a normal distribution. In fact it is the sum of 18000 iid variables of the form $Y_i$ where $$ \begin{eqnarray} P(Y_i = 1) &=& \frac16, \\ P(Y_i = -1) &=& \frac16, \\ P(Y_i = 0) &=& \frac23. \end{eqnarray} $$ But presumably it is "close enough" to a normal distribution that the calculation in the solution will match the correct probability to at least four decimal places.

Each $Y_i$ has mean $0$ and variance $E[Y_i^2] = \frac16 + \frac16 = \frac13$, so their sum indeed has variance $18000 \cdot \frac13 = 6000$. The claim of the solution appears to be that $X_1 - X_4$ has approximately the distribution $N(0,6000)$, which seems not completely unreasonable.
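The per-roll moments can be checked directly from the three-point distribution of $Y_i$. The snippet below is a small sketch using exact fractions; the names `pmf` and `n_rolls` are just illustrative.

```python
from fractions import Fraction

# Distribution of a single roll's contribution Y_i to X_1 - X_4:
# +1 if the roll is a 1, -1 if it is a 4, 0 otherwise.
pmf = {1: Fraction(1, 6), -1: Fraction(1, 6), 0: Fraction(2, 3)}

mean = sum(y * p for y, p in pmf.items())
variance = sum((y - mean) ** 2 * p for y, p in pmf.items())

n_rolls = 18000
total_variance = n_rolls * variance  # variances add for independent rolls

print(mean, variance, total_variance)
```

The total comes out to 6000, the variance used in the solution.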

But if $Z$ is a standard normal random variable, $P(Z \le 1.033) = 0.8492 = 1 - 0.1508$, not $0.1508$, so the "solution" is not quite right: $0.1508$ is the upper-tail probability $P(Z > 1.033)$.
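If you want to check the table lookup yourself, the standard normal CDF can be written with the error function, $\Phi(z) = \frac12\left(1 + \operatorname{erf}(z/\sqrt2)\right)$. Here is a minimal sketch using only Python's standard library; `std_normal_cdf` is a helper name chosen for this example, and the normal approximation itself is taken as given.

```python
import math

def std_normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

z_exact = 80 / math.sqrt(6000)  # the standardized threshold, before rounding
z_rounded = 1.033               # the rounded value used in the solution

p_le = std_normal_cdf(z_rounded)

print(f"z             = {z_exact:.4f}")
print(f"P(Z <= 1.033) = {p_le:.4f}")
print(f"P(Z >  1.033) = {1 - p_le:.4f}")
```

The lower-tail value is the one that answers $P(X_1 - X_4 \le 80)$ under the normal approximation; the upper-tail value is what the solution reported.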