Why does the same Bernoulli approach lead to different answers?

I have an expected-value problem that I solved with two methods, both based on Bernoulli trials. The two give different results, and I can't figure out why.

Problem)

There are 5 families with 6 kids each. How many of the 5 families are expected to have 4 or more daughters? (Each child is a son or a daughter with probability 1/2.)

1st method)

This is the one written in the book.

P(D>=4) = the probability that a given family has 4 or more daughters (that is, 4, 5, or 6 daughters).

$$ P(D \geqq 4 ) = P(D = 4) + P(D=5) + P(D=6) $$ $$ = 6C4 \times (\frac{1}{2})^6 + 6C5 \times (\frac{1}{2})^6 + 6C6 \times (\frac{1}{2})^6 = \frac{11}{32} $$ (C : combination)

The number of families that have 4 or more daughters (call it 'X') then follows the binomial distribution $ B(5,\frac{11}{32}) $. $$ E(X) = 5\times\frac{11}{32} =\frac{55}{32} $$
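As a quick check of method 1, the exact binomial sum can be reproduced with rational arithmetic. This is only a sketch; the helper name `prob_at_least` is made up for illustration and does not come from the book.

```python
from fractions import Fraction
from math import comb


def prob_at_least(n, k, p):
    """Exact P(D >= k) for D ~ B(n, p), summing the binomial terms."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Probability that one family with 6 kids has 4 or more daughters.
p_family = prob_at_least(6, 4, Fraction(1, 2))

# X ~ B(5, p_family), so E(X) = 5 * p_family.
expected_families = 5 * p_family

print(p_family)           # 11/32
print(expected_families)  # 55/32
```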

2nd method)

This is what I thought.

D = the number of daughters in one family.

D follows $B(6, \frac{1}{2})$, because it's like repeating a trial with 50% success probability 6 times.

$E(D) = 6\times\frac{1}{2} = 3 $

$V(D) = 6\times\frac{1}{2}\times\frac{1}{2} = \frac{3}{2} $

$\sigma(D) = \sqrt\frac{3}{2} $
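The mean and variance above can be confirmed directly from the probabilities of $B(6, \frac{1}{2})$. The snippet below is a minimal sketch; the variable names are chosen here for illustration.

```python
from fractions import Fraction
from math import comb

n, p = 6, Fraction(1, 2)

# Probability of exactly k daughters among n kids, for k = 0..n.
pmf = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

# E(D) and V(D) computed from the definition rather than from np and np(1-p).
mean = sum(k * q for k, q in pmf.items())
variance = sum((k - mean) ** 2 * q for k, q in pmf.items())

print(mean)      # 3
print(variance)  # 3/2
```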

On the standard normal distribution (N, with the horizontal axis called 'z') that approximates the binomial, let me find the portion corresponding to $(D\geqq 4) $.

$$ P(D\geqq4) = N(z \geqq \frac{4 - 3}{\sqrt\frac{3}{2}}) = N(z \geqq 0.816497) = 0.209 $$

So a family with 6 kids has probability 0.209 of having 4 or more daughters. With 5 families, the expected number of families having 4 or more daughters is $$ 5 \times 0.209 = 1.045 $$

However, this is quite different from the expected value $\frac{55}{32}$ from method 1. Why does this difference arise? And which method is more appropriate?

Thank you genius!

Best answer

First off, the normal approximation is just that: an approximation. You won't get the exact same result.

That being said, you should get a much better result. The mistake in your approach 2 is in translating $D\geq 4$ to a normal variable. If $X$ is normally distributed with mean $3$ and standard deviation $\sqrt{3/2}$, then the range of $X$ that corresponds to $D\geq 4$ is not $X\geq 4$, it's $X\geq 3.5$ (the case $D = 4$ corresponds to $3.5\leq X\leq 4.5$).

So you actually want $$ N\left(z\geq \frac{3.5-3}{\sqrt{3/2}}\right) \approx 0.341546 $$ which gives a final answer that is a lot closer to $\frac{55}{32}$.
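To see the effect of the continuity correction numerically, the two normal tails can be compared with the exact value $\frac{11}{32}$. This is a sketch only: `normal_upper_tail` is a helper defined here, built on the standard library's `math.erfc`, and the uncorrected value it prints may differ in the third decimal from a table lookup.

```python
from math import erfc, sqrt


def normal_upper_tail(x, mean, sd):
    """P(X >= x) for X ~ Normal(mean, sd), via the complementary error function."""
    z = (x - mean) / sd
    return 0.5 * erfc(z / sqrt(2))


mean, sd = 3.0, sqrt(1.5)   # E(D) and sigma(D) for D ~ B(6, 1/2)
exact = 11 / 32             # exact P(D >= 4)

no_correction = normal_upper_tail(4.0, mean, sd)    # cutoff at X >= 4
with_correction = normal_upper_tail(3.5, mean, sd)  # cutoff at X >= 3.5

print(f"exact P(D >= 4):       {exact:.6f}")
print(f"normal, cutoff 4:      {no_correction:.6f}")
print(f"normal, cutoff 3.5:    {with_correction:.6f}")
print(f"expected families, exact:     {5 * exact:.4f}")
print(f"expected families, corrected: {5 * with_correction:.4f}")
```

With the cutoff at 3.5 the approximate expectation is about 1.708, against the exact $\frac{55}{32} \approx 1.719$.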