How does randomly creating a subset (from a set) affect the probability (that was originally applicable for the item in the set) of the items in it?


I have the following question from a book.

A lot has 10% defective items. Ten items are chosen randomly from this lot. The probability that exactly 2 of the chosen items are defective is?

And the solution to this is:

The problem can be done using binomial distribution since the population is infinite.

Probability of defective item $p=0.1$. Probability of non-defective item $q=0.9$. Probability that exactly 2 of the chosen items are defective:

$= {}^{10}C_{2}\,p^{2}q^{8} = {}^{10}C_{2}\,(0.1)^{2}(0.9)^{8} = 0.1937$

Now my doubt is: how can the probability for the new random selection of 10 items be the same as that of the bigger set (infinite in this case, although I would prefer a general answer)? Suppose we coincidentally get 10 defective items then the probability of the lot of 10 having defective items would be $100 \%$, right?

Can someone please explain to me if the solution is wrong or I am missing something here?


There are 2 best solutions below

1
On BEST ANSWER

As stated, the problem is ambiguous. If the lot size is small then we can't treat each choice of an item as an independent event. To take an extreme case, suppose the lot only had $10$ items, with $1$ defective one. In that case, of course, the probability of drawing two defective ones is $0$.

If we fix the lot size as $N$ (taking $N$ to be a multiple of $10$, so that $.1N$ is an integer), then there are $\binom N{10}$ ways to choose $10$ items and $\binom {.1N}{2}\times \binom {.9N}{8}$ ways to choose them so that exactly $2$ are defective. Under this assumption the probability, as a function of $N$, is the quotient $$\frac {\binom {.1N}{2}\times \binom {.9N}{8}}{\binom N{10}}$$

In particular, for $N=100$ we get $0.201509885$. For $N=10^6$ we get $0.193710998$.
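The two values above can be reproduced with a short script. This is only a sketch: the function name `prob_exactly_k_defective` and its parameters are made up for illustration, and it assumes the lot size is a multiple of $10$ so that the number of defectives is a whole number.

```python
from math import comb

def prob_exactly_k_defective(lot_size, sample_size=10, k=2):
    """Probability of exactly k defectives when drawing sample_size items
    without replacement from a lot in which 10% of the items are defective.
    Assumes lot_size is a multiple of 10 (illustrative helper, not from the book)."""
    defective = lot_size // 10          # the .1N defective items
    good = lot_size - defective         # the .9N satisfactory items
    favourable = comb(defective, k) * comb(good, sample_size - k)
    return favourable / comb(lot_size, sample_size)

for lot_size in (100, 10**6):
    print(lot_size, prob_exactly_k_defective(lot_size))
```

With the extreme case from the first paragraph (a lot of $10$ items containing $1$ defective), the same function returns $0$, because there is no way to choose $2$ defectives out of $1$.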

It seems reasonable to imagine that we are approaching a limit as $N\to \infty$. Indeed, if we assume that $N$ is very large then it makes sense to regard each draw as an independent event. Under that assumption we are free to think of this as a binomial distribution, with "success" probability $.1$ in which case the answer would be $$\binom {10}2\times .1^2\times .9^8=0.193710245$$

which, unsurprisingly, is more or less the same as the value we got for $N=10^6$. After all, while it is technically true that drawing a defective item (from the $100000$ defectives) lowers the probability that the next one is also defective, it certainly doesn't lower it by much.
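To put a number on "not by much" for $N=10^6$: once one defective item has been drawn, $99999$ defectives remain among $999999$ items, so

$$P(\text{second defective}\mid\text{first defective})=\frac{99999}{999999}\approx 0.0999991,$$

compared with $0.1$ for the first draw, a change of about $9\times 10^{-7}$.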

As an analogy, suppose we were tossing a biased coin repeatedly, and writing down the results. Here our coin comes up $H$ with probability $.1$ and, of course, each toss is independent of all the others.

2
On

You wrote:

Suppose we coincidentally get 10 defective items then the probability of the lot of 10 having defective items would be 100%, right?

Yes, but what is the probability of "coincidentally" getting ten out of ten defective items? It would be the probability that the first item is defective (which is $0.1$), times the probability that the second item is defective (which is also $0.1$), times the probability that the third item is defective, etc. In the end, you obtain $(0.1)^{10}$ as the probability that the entire sample is defective.

A somewhat more complex question regards the probability of "coincidentally" getting two defective items out of the sample of ten. One way would be to get defective items for the first two (which happens with probability $(0.1)^2$), and then satisfactory items for the remaining eight (which happens with probability $(0.9)^8$), for a total probability of $(0.1)^2(0.9)^8$.

Of course, there are other combinations than just "first two defective, remaining eight satisfactory." How many combinations are there? Well, there are as many as there are ways to choose two items out of ten, which is $\binom{10}{2} = 45$. Each one has the same probability of $(0.1)^2(0.9)^8$, which means the total overall probability is $45(0.1)^2(0.9)^8$.
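If you would rather check the counting argument by brute force, the sketch below lists every possible defective/satisfactory pattern for ten items, keeps those with exactly two defectives, and adds up their probabilities. It assumes independent draws with $p=0.1$, as in the infinite-population setting; the variable names are only illustrative.

```python
from itertools import product
from math import comb

p, q = 0.1, 0.9        # defective / satisfactory probability per item
n, k = 10, 2           # sample size and required number of defectives

count = 0              # number of patterns with exactly k defectives
total = 0.0            # sum of their probabilities

for pattern in product((0, 1), repeat=n):   # 1 marks a defective item
    if sum(pattern) != k:
        continue
    # Independent draws: multiply the per-item probabilities.
    pattern_prob = 1.0
    for item in pattern:
        pattern_prob *= p if item else q
    count += 1
    total += pattern_prob

print(count, comb(n, k))                    # both should be 45
print(total, comb(n, k) * p**k * q**(n - k))
```

The two printed probabilities should agree up to floating-point rounding, which is the $45(0.1)^2(0.9)^8$ from the argument above.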


It should be emphasized that this reasoning is exact only because the population is treated as infinite. If it were finite, then the probability distribution of the second item in the sample would depend to some extent on whether the first item in the sample was defective or satisfactory.
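To see that dependence concretely, here is a small sketch for a finite lot with 10% defectives. The helper `second_defective_given_first` is hypothetical, and it assumes the lot size is a multiple of $10$.

```python
from fractions import Fraction

def second_defective_given_first(lot_size, first_defective):
    """Exact probability that the second item drawn is defective, given
    whether the first one was. Assumes 10% of the lot is defective and
    lot_size is a multiple of 10 (illustrative helper)."""
    defective = lot_size // 10
    # Drawing a defective first leaves one fewer defective in the lot.
    remaining_defective = defective - (1 if first_defective else 0)
    return Fraction(remaining_defective, lot_size - 1)

for lot_size in (100, 10**6):
    after_defective = second_defective_given_first(lot_size, True)
    after_good = second_defective_given_first(lot_size, False)
    print(lot_size, float(after_defective), float(after_good))
```

For a lot of $100$ the two conditional probabilities are $9/99$ and $10/99$, a visible gap; for a lot of $10^6$ they are both very close to $0.1$, which is why treating the draws as independent is harmless there.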