Using Z-tests to find $Pr(Y > X)$ with Binomial Random Variables

Quick note: this is a homework question, so don't feel the need to answer fully - prodding me towards the answer is welcome :)

I'm doing a "problem solving question" that has a bit of extra context around it, but the crux of the question is as follows.

Given:

$ X \sim Bin(10 000, 0.3)$ and $ Y \sim Bin(10 000, 0.31)$

$ X $ and $ Y $ are independent of each other.

What is the probability that $ Y > X $?

As part of an earlier question, I had to use the Central Limit Theorem to find the approximate Normal distributions of $ X $ and $ Y $. I got:

$ X \sim N(3000, 2100)$ and $ Y \sim N(3100, 2139)$ (I'm not sure about the notation here. Does it need to be something specific, e.g. $ X_{Norm} $?)

I was sort of following the argument of letting:

$ Z = Y - X $

$ \therefore ~ Z \sim N(3100 - 3000, 1^2 \cdot 2100 + (-1)^2 \cdot 2139) = N(100, 4239)$
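As a quick sanity check on these parameters, here is a minimal Python sketch that recomputes them from $np$ and $np(1-p)$. The variable names are illustrative assumptions, not part of the original working.

```python
# Normal-approximation parameters for X ~ Bin(n, p_x) and Y ~ Bin(n, p_y).
# All names here are illustrative; only n and the two probabilities come from the question.
n = 10_000
p_x, p_y = 0.30, 0.31

# Mean n*p and variance n*p*(1-p) of each binomial.
mean_x, var_x = n * p_x, n * p_x * (1 - p_x)
mean_y, var_y = n * p_y, n * p_y * (1 - p_y)

# For independent X and Y, Z = Y - X has mean (mean_y - mean_x)
# and variance (var_x + var_y): variances add even under subtraction.
mean_z = mean_y - mean_x
var_z = var_x + var_y

print(round(mean_z, 6), round(var_z, 6))
```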

Then using a Z-test to find $ Pr(Z > 0) $:

$ \sqrt{10 000} \cdot \frac{0 - 100}{\sqrt{4239}} \approx -153.59$

I'm almost certain that this is incorrect, because when I simulate it in MatLab I get a success rate for $ Y > X $ of around $ 93.7\% $.

Since I'm much better at programming than I am at maths, I'm inclined to believe that I've done something wrong. Have I? And is there a better way to do this, or a more succinct argument to use?

Best answer

After the normal approximation, the $Z$ you found is correct. Since $Z$ is itself a normal random variable, the probability $P(Z>0)$ can be found by standardizing and reading the value from the standard normal table.

Note that $$ \frac{0-100}{\sqrt{4239}} \approx -1.53, $$ so $$ P(Z>0)\approx P(Z_0 > -1.53) = 1-\Phi(-1.53) = \Phi(1.53) \approx 0.937. $$

Here, $Z_0\sim N(0,1^2)$ is the standard normal random variable and $\Phi(z) = P(Z_0\leq z)$ is the cumulative distribution function (CDF) for the standard normal random variable.
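If a table is not at hand, the same value can be computed numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative assumptions, and the mean and variance are the ones derived in the question.

```python
from math import sqrt
from statistics import NormalDist

# Parameters of Z = Y - X from the normal approximation in the question.
mean_z, var_z = 100.0, 4239.0

# Standardize the threshold 0: divide by the standard deviation of Z only.
z_score = (0 - mean_z) / sqrt(var_z)

# P(Z > 0) = P(Z_0 > z_score) = 1 - Phi(z_score), with Z_0 standard normal.
p_standard = 1 - NormalDist().cdf(z_score)

# Same probability computed directly on N(mean_z, var_z), without standardizing.
p_direct = 1 - NormalDist(mu=mean_z, sigma=sqrt(var_z)).cdf(0)

print(f"{z_score:.4f} {p_standard:.4f} {p_direct:.4f}")
```

Both routes give the same probability, which is the point of standardizing: it only rescales the variable so that a single table (or CDF) can be used.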

Thus, the $93.7\%$ from your simulation agrees with the approximate value.

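So there is no need to multiply by $\sqrt{10000}$ in your calculation: $4239$ is already the variance of $Z$ itself, not of a sample mean, so standardizing only requires dividing by $\sqrt{4239}$.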
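As an additional check that relies on neither simulation nor the normal approximation, $P(Y > X)$ can be summed exactly from the two binomial distributions, using $P(Y > X) = \sum_y P(Y = y)\,P(X < y)$. The sketch below uses only the Python standard library; the function names `binom_pmf` and `prob_y_greater` are hypothetical helpers introduced for illustration.

```python
from math import exp, lgamma, log

def binom_pmf(n, p):
    """Return [P(K = k) for k = 0..n] for K ~ Bin(n, p).

    Works in log space (via lgamma) so large n does not overflow.
    """
    log_p, log_q = log(p), log(1 - p)
    return [
        exp(lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log_p + (n - k) * log_q)
        for k in range(n + 1)
    ]

def prob_y_greater(n, p_x, p_y):
    """Exact P(Y > X) for independent X ~ Bin(n, p_x), Y ~ Bin(n, p_y)."""
    pmf_x = binom_pmf(n, p_x)
    pmf_y = binom_pmf(n, p_y)
    total = 0.0
    prob_x_below = 0.0  # running value of P(X < y)
    for y in range(n + 1):
        total += pmf_y[y] * prob_x_below
        prob_x_below += pmf_x[y]  # now equals P(X < y + 1)
    return total

p_exact = prob_y_greater(10_000, 0.30, 0.31)
print(f"{p_exact:.4f}")
```

The printed value can be compared with both the table lookup and the simulated success rate. A small gap from the normal approximation is expected, since the exact sum treats $Y > X$ as $Y - X \geq 1$ on integers.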