Why was this probability calculated using the binomial distribution?


The following is an exercise in this book (Discrete-Event System Simulation - Fourth Edition).

Exercise 5.3

A recent survey indicated that 82% of single women aged 25 years old will be married in their lifetime. Using the binomial distribution, find the probability that two or three women in a sample of twenty will never be married.

Solution

(From the book's solution manual)

Let X be defined as the number of women in the sample never married

P(2 ≤ X ≤ 3) = p(2) + p(3)

= $ \binom{20}{2} (.18)^2 (.82)^{18} + \binom{20}{3} (.18)^3 (.82)^{17} $

= .173 + .228 = .401
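The arithmetic in the manual's solution can be checked with a short script. This is only a sketch: `binomial_pmf`, `n`, and `p_never` are names chosen here for illustration and do not come from the book.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials,
    each with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n = 20          # sample size
p_never = 0.18  # chance that a given woman never marries (1 - 0.82)

p2 = binomial_pmf(2, n, p_never)
p3 = binomial_pmf(3, n, p_never)

# Rounded to three decimals, as in the manual
print(round(p2, 3), round(p3, 3), round(p2 + p3, 3))  # 0.173 0.228 0.401
```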

My Question

If I understand it correctly, the binomial distribution is a discrete probability distribution of a number of successes in a sequence of n independent yes/no experiments.

But choosing 2 (or 3) women from a 20-woman sample does not seem to consist of independent experiments, because choosing the first woman affects the probabilities for the experiments that follow.

Why was the binomial distribution used here?


3
On BEST ANSWER

I think the question is assuming that each individual woman has an 82% chance of getting married, independently of what other women will do.

We aren't choosing a 2- or 3-woman sample; we are merely looking at the marital status of 20 women and checking whether 2 or 3 of them happen to be unmarried.


EDIT: Another way of looking at the problem:

Let's say we have a ball pit filled with 1 million balls. 820,000 are blue and 180,000 are red. Therefore, if I pick a ball at random, I have an 82% chance of it being blue and an 18% chance of it being red.

Now, what if I draw a blue ball, throw that ball away, and decide I want to draw another one? It's true that the probability distribution has changed, since there are now 819,999 blue balls and 180,000 red balls, with 999,999 total balls. But for simplicity's sake, we can assume the probability distribution hasn't changed very much (by less than $10^{-6}$, in fact), so keeping our 82%/18% distribution is still going to be mostly accurate.

If I draw a small number of samples relative to the total number of balls (~20 samples relative to 1 million), the distribution is approximately binomial.

So on a mathematical level, you are correct: the distribution does change when you sample without replacement, but I think the problem wants you to make a simplifying assumption.
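To see how little the simplifying assumption costs, one can compare the exact without-replacement (hypergeometric) probabilities for the ball pit with the binomial approximation. This is a rough sketch; the function and variable names are made up for illustration, and the numbers are the ball-pit figures from above.

```python
from math import comb

def hypergeom_pmf(k: int, n: int, reds: int, total: int) -> float:
    """P(exactly k red balls) when n balls are drawn WITHOUT replacement
    from a pit holding `reds` red balls out of `total`."""
    return comb(reds, k) * comb(total - reds, n - k) / comb(total, n)

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k red balls) when each draw is red with fixed probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

total, reds, n = 1_000_000, 180_000, 20

for k in (2, 3):
    exact = hypergeom_pmf(k, n, reds, total)
    approx = binomial_pmf(k, n, reds / total)
    print(f"k={k}: exact={exact:.6f}  binomial={approx:.6f}  "
          f"diff={abs(exact - approx):.2e}")
```

With only 20 draws from a million balls, the two columns should agree to several decimal places, which is why the binomial model is a reasonable stand-in here.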

0
On

The question implies that the population of 25-year-old women is large enough, relative to the sample, that the probability for any given woman can be taken to be 0.82, and hence that the trials can be treated as independent.

Whether each woman marries in her lifetime is an independent Bernoulli trial (or at least is assumed to be for the sake of the question). Because the sample then forms a sequence of independent Bernoulli trials, the binomial distribution is well suited to the problem.

0
On

"Two or three women in a sample of twenty" means you went out into the general population and found a woman to put in your sample. Then you did this $19$ more times so that you had a sample of $20$ women.

Now count how many of the women will never marry. The answer $X$ is an integer in the range from $0$ to $20$, inclusive. Moreover, $X = X_1 + X_2 + \cdots + X_{20}$ where $X_n$ is $1$ if the $n$th woman you put in your sample will never marry, $0$ otherwise.

At no time do you need or want to select two or three women from your sample of $20$. You have selected all of the $20$ women from the larger population, and that's the last selection you should make.

(The use of binomial coefficients such as $\binom{20}{2}$ may cause some confusion on this point, because we read $\binom{20}{2}$ as "$20$ choose $2$". But the $\binom{20}{2}$ does not represent any "choice" that you make from your sample; all it does is help you count the possible outcomes of your sample that have the result $X=2$.)
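The counting role of $\binom{20}{2}$ can be made concrete by listing the arrangements directly. In this sketch (names are illustrative only), each arrangement records which two positions in the sample hold women who will never marry; every such arrangement has the same probability, so the count simply multiplies it.

```python
from itertools import combinations
from math import comb

n, k = 20, 2
p_never = 0.18

# Every way to place the k never-marrying women among the n sample positions.
# Nothing is being "chosen" from the sample; this only enumerates outcomes.
arrangements = list(combinations(range(n), k))
count = len(arrangements)

# Each specific arrangement (k women never marry, the other n - k do)
# has the same probability under independence.
prob_one_arrangement = p_never**k * (1 - p_never) ** (n - k)

prob_x_equals_k = count * prob_one_arrangement
print(count, comb(n, k), round(prob_x_equals_k, 3))  # 190 190 0.173
```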

As noted in another answer, if the population from which you sample contains one million women who were unmarried at age $25$, and exactly $820{,}000$ of those women will eventually marry, then after you have put one woman in your sample there are $999{,}999$ remaining women to choose from, of whom either $819{,}999$ or $820{,}000$ will eventually marry, and in neither case is the probability of the next woman eventually marrying exactly $82\%$.

But that is, first of all, needlessly "precise", and secondly, not even a very good interpretation of the survey results. A survey cannot tell you exactly how many women in a population will marry; it is only an estimate. A simpler (for our purposes) interpretation of the survey is that every time a woman is born and reaches the age of $25$ without marrying, she has an $82\%$ chance to marry later. We can assume this is true for every woman who reaches the age of $25$ without marrying, regardless of what happens to any other woman, and so it is reasonable to say that each woman in the entire population who reaches the age of $25$ without marrying is an independent Bernoulli trial with probability $0.18$ of never marrying. They are still independent Bernoulli trials even when you look only at $20$ of them selected at random.
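Under this interpretation, the experiment is easy to simulate: treat each of the $20$ women as an independent Bernoulli trial with probability $0.18$ of never marrying, and count how often $X$ comes out as $2$ or $3$. The following is a minimal sketch; the function name, the seed, and the number of repetitions are arbitrary choices, not part of the exercise.

```python
import random

def estimate_two_or_three(trials: int, n: int = 20,
                          p_never: float = 0.18, seed: int = 0) -> float:
    """Monte Carlo estimate of P(2 <= X <= 3), where X is the sum of
    n independent Bernoulli(p_never) indicators."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    hits = 0
    for _ in range(trials):
        # X = X_1 + ... + X_n, each X_i is 1 with probability p_never
        x = sum(rng.random() < p_never for _ in range(n))
        if 2 <= x <= 3:
            hits += 1
    return hits / trials

estimate = estimate_two_or_three(100_000)
print(estimate)  # should land close to the book's 0.401
```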