Simple explanation for Hypergeometric distribution probability


I am working through the hypergeometric distribution:

The probability that we select a sample of size $n$ containing $r$ defective items from a population of $N$ items known to contain $M$ defective items is

$P(X = r) = \dfrac{C(M,r)\, C(N-M,n-r)}{C(N,n)}$

where $C(P,Q)$ is the number of combinations of $P$ items taken $Q$ at a time.
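As a quick way to experiment with the formula, here is a minimal Python sketch. The function name `hypergeom_pmf` and the sample values are illustrative assumptions, not part of the source material; `math.comb` plays the role of $C(P,Q)$.

```python
from math import comb

def hypergeom_pmf(N, M, n, r):
    """P(X = r): exactly r defectives in a sample of n drawn from N items, M of them defective."""
    # Negative counts are impossible; math.comb would raise ValueError on them.
    if r < 0 or n - r < 0:
        return 0.0
    # math.comb(a, b) returns 0 when b > a, so other impossible cases give 0 too.
    return comb(M, r) * comb(N - M, n - r) / comb(N, n)

# Illustrative values: 5 items, 3 defective, sample of 3, exactly 2 defective.
print(hypergeom_pmf(5, 3, 3, 2))
```

Summing the function over all possible values of `r` should give 1, which is a handy sanity check on the formula.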

Explanation on the above equation:

(a) we may select $n$ items from a population of $N$ items in $C(N,n)$ ways - understood

(b) we may select $r$ defective items from $M$ defective items in $C(M,r)$ ways - understood

(c) we may select $n-r$ non-defective items from $N-M$ non-defective items in $C(N-M,n-r)$ ways - did not understand

(d) hence we may select $n$ items containing $r$ defectives in $C(M,r) * C(N-M,n-r)$ ways - did not understand

Why must both (b) and (c) be considered, and why are those factors multiplied in (d)?

Can anybody explain the derivation of the hypergeometric distribution in simple terms?

The above material is taken from: The Hypergeometric distribution

On BEST ANSWER

Let's do it with an example: $N=5$ objects, of which $M=3$ are defective and $N-M=2$ are not defective. $n=3$ items are selected. What is the probability that $r=2$ of them are defective?

Set of objects: $\{D_1,D_2,D_3,N_1,N_2\}$. There are $C(5,3)=10$ ways to take $3$ out of $5$ ((a) understood).

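Looking only at the defectives, there are $C(3,2)=3$ ways to take out $2$ ((b) understood). Concretely, the possibilities are $D_1D_2$, $D_1D_3$ and $D_2D_3$.

Looking only at the non-defectives, there are $C(2,1)=2$ ways to take out $1$ ((c) not understood). Concretely, the possibilities are $N_1$ and $N_2$. This count is needed because a sample of $n=3$ items with $r=2$ defectives must be completed with $n-r=1$ non-defective.

Each of the $3$ choices of defectives can be combined with each of the $2$ choices of non-defective, which is why the counts are multiplied. That gives $3\times2=6$ possibilities for taking out $2$ defectives (and automatically one non-defective), which are: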

  • $D_1D_2N_1$
  • $D_1D_3N_1$
  • $D_2D_3N_1$
  • $D_1D_2N_2$
  • $D_1D_3N_2$
  • $D_2D_3N_2$

The probability that this happens is therefore $\dfrac{C(3,2)\times C(2,1)}{C(5,3)} = \dfrac{3\times2}{10} = \dfrac{3}{5}$.
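The counting above can also be checked by brute force. The following is a small Python sketch (the variable names are my own, chosen for illustration) that pairs every choice of defectives with every choice of non-defective, then compares the result with a direct enumeration of all samples of size $3$.

```python
from itertools import combinations, product

defectives = ["D1", "D2", "D3"]
non_defectives = ["N1", "N2"]

# Step (b): ways to pick 2 defectives; step (c): ways to pick 1 non-defective.
defective_pairs = list(combinations(defectives, 2))
non_defective_picks = list(combinations(non_defectives, 1))

# Step (d): every defective pair can go with every non-defective pick,
# so the favourable samples are the Cartesian product of the two lists.
favourable = [d + g for d, g in product(defective_pairs, non_defective_picks)]

# Step (a): all samples of size 3 from the 5 objects.
all_samples = list(combinations(defectives + non_defectives, 3))

# Cross-check: filter all samples for those with exactly 2 defectives.
direct = [s for s in all_samples if sum(x.startswith("D") for x in s) == 2]

print(len(defective_pairs), len(non_defective_picks))  # 3 2
print(len(favourable), len(all_samples))               # 6 10
print(set(favourable) == set(direct))                  # True
print(len(favourable) / len(all_samples))              # 0.6
```

The product step is the multiplication in (d): the choice of defectives and the choice of non-defectives are made independently, so the numbers of ways multiply.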