From a First Year of College Mathematics by Henry Miles,
Chapter 16 review: #17. If 20 per cent of the students in a certain course fail, what is the probability that exactly 2 of 5 students chosen at random will pass?
I understand that the probability of selecting a passing student is $\frac{4}{5}$, so I computed the probability of selecting exactly 2 passing students out of 5 as $\left(\frac{4}{5}\right)^2 \cdot \left(\frac{1}{5}\right)^3 = \frac{16}{3125}$. However, the correct answer is $\frac{32}{625}$.
How did he arrive at that answer? I see that the correct answer is 10 times my answer, where 10 is the number of distinct permutations of 2 passes and 3 fails (not all $5!$ orderings are unique):
$\frac{\mathbf{P}(5,5)}{2!3!} = \frac{5!}{2!3!(5-5)!} = 10$
But I don't quite get why you multiply by 10. Is it because there are 10 ways you can arrive at an event with probability of $\frac{16}{3125}$, so you scale by 10, or is this just coincidence? Thanks!
Five hours later, this is to formalize @lulu's Comment, and give some additional examples.
To get an exact answer, you need to know the number of students in the class. With the choice of each student (without replacement), the population changes slightly. This means that choices are not exactly independent, as required by the binomial distribution.
Binomial. If we decide to use the binomial distribution as an approximation, then the number of passing students in a sample of $n = 5$ is $X \sim Binom(5, .8),$ and you want $$P(X = 2) = {5 \choose 2}(.8)^2(.2)^3 = 0.0512.$$ There are ${5 \choose 2} = \frac{5!}{2!3!} = 10$ possible arrangements of P and F among the students chosen: e.g., PPFFF, PFPFF, and so on. By independence, each of these has probability $(.8)^2(.2)^3 = \frac{16}{3125}$ of occurring. Because the arrangements are mutually exclusive, their probabilities add, which is why you multiply by 10 to get $\frac{32}{625} = 0.0512;$ it is not a coincidence. In R statistical software, `dbinom(2, 5, .8)` returns the same value.
Because the binomial model is only an approximation, the most you can say without knowing the class size is that the actual answer is about $0.05.$
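The count of 10 arrangements and the resulting probability can be checked by brute force. The following is a minimal Python sketch (standard library only; the variable names are illustrative and not from the original post) that lists every P/F sequence with exactly two P's and adds up their probabilities using exact fractions:

```python
from fractions import Fraction
from itertools import product

p_pass = Fraction(4, 5)  # 80 per cent of students pass
p_fail = Fraction(1, 5)  # 20 per cent of students fail

# Every length-5 sequence of P/F that contains exactly two P's.
arrangements = ["".join(s) for s in product("PF", repeat=5) if s.count("P") == 2]

# Under independence, each such sequence has the same probability.
p_each = p_pass**2 * p_fail**3

# The sequences are mutually exclusive, so their probabilities add.
p_total = len(arrangements) * p_each

print(len(arrangements))  # 10
print(p_each)             # 16/3125
print(p_total)            # 32/625
```

The factor of 10 is therefore just the number of equally likely, mutually exclusive ways to land on "2 pass, 3 fail".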
Hypergeometric. To get an idea how the exact probability would vary depending on the number of students in the class, we can make an assumption about the class size $N$ and use the hypergeometric distribution.
Suppose there are $N = 100$ students in the class; 20 failing and 80 passing. Then the total number of ways to sample 5 students without replacement from among $N$ is ${100 \choose 5}.$ Also, the number of ways to select 2 passing students and 3 failing students is the product ${80 \choose 2}{20 \choose 3}.$ So the probability of getting $Y = 2$ passing students is $$P(Y = 2) = \frac{{80 \choose 2}{20 \choose 3}}{{100 \choose 5}} = 0.047849.$$
We say that $Y$ has a hypergeometric distribution, and you should check your textbook for the exact formulation. In R, the computation is `dhyper(2, 80, 20, 5)`, which agrees with the value above.
So the exact probability for a class of 100 students is about 0.05.
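As a cross-check of the hypergeometric calculation for $N = 100$, here is a short Python sketch using only `math.comb`. The function name `hypergeom_pmf` and its argument names are hypothetical, chosen for this illustration:

```python
from math import comb

def hypergeom_pmf(k, n_pass, n_fail, n_draw):
    """P(exactly k passing students among n_draw drawn without replacement)."""
    favorable = comb(n_pass, k) * comb(n_fail, n_draw - k)
    total = comb(n_pass + n_fail, n_draw)
    return favorable / total

# Class of 100: 80 passing, 20 failing; sample 5, want exactly 2 passing.
p = hypergeom_pmf(2, 80, 20, 5)
print(round(p, 6))  # 0.047849
```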
Similarly, we can see that the desired probability depends on the size of the class. Two extremes: If $N = 1000$ (800 passing, 200 failing) it is 0.0509, and if $N = 10$ (8 passing, 2 failing) it is 0. You might be surprised by the last answer; there are certainly two P students available for selection, but there aren't three F students in the class.
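To see how the exact answer moves toward the binomial value of 0.0512 as the class grows, the sketch below sweeps over a few class sizes. It assumes the class size is a multiple of 5, so that exactly 20 per cent fail; the helper name `prob_two_pass` is hypothetical:

```python
from math import comb

def prob_two_pass(class_size, n_draw=5, k=2):
    """Hypergeometric P(exactly k of n_draw sampled students pass).

    Assumes class_size is a multiple of 5, so exactly 20% of the class fails.
    """
    n_fail = class_size // 5
    n_pass = class_size - n_fail
    # math.comb returns 0 when asked for more items than are available,
    # which covers the N = 10 case (only 2 failing students, 3 needed).
    return comb(n_pass, k) * comb(n_fail, n_draw - k) / comb(class_size, n_draw)

for size in (10, 100, 1000):
    print(size, round(prob_two_pass(size), 4))
# 10 0.0
# 100 0.0478
# 1000 0.0509
```

Sampling without replacement matters less as $N$ grows, which is why the binomial model is a reasonable approximation for large classes.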
Below is a bar chart of the hypergeometric distribution for a class of $N=100$ students. The red circles near the tops of the bars show probabilities from $Binom(5, .8).$ The resolution of the graph is just good enough to see that the hypergeometric and binomial probabilities are not exactly the same. (For a perfect match, the tops of the bars should be at the exact centers of the circles.)