Urn problem - Why isn't P(Yellow M&M is from 1994 bag) = P(Bag from 1996|Green M&M)*P(Bag from 1994|Yellow M&M)?


The blue M&M was introduced in 1995. Before then, the color mix in a bag of plain M&Ms was (30% Brown, 20% Yellow, 20% Red, 10% Green, 10% Orange, 10% Tan). Afterward it was (24% Blue , 20% Green, 16% Orange, 14% Yellow, 13% Red, 13% Brown).

A friend of mine has two bags of M&Ms, and he tells me that one is from 1994 and one from 1996. He won't tell me which is which, but he gives me one M&M from each bag. One is yellow and one is green. What is the probability that the yellow M&M came from the 1994 bag?

I found this on Allen Downey's blog and read the author's solution. I understand that P(Yellow M&M came from 1994 bag) isn't equal to P(Bag from 1994|Yellow M&M) = 0.588 because we have additional information that a green M&M was selected from the other bag.

However, if the probability of selecting an M&M from either bag is independent (and we know the M&Ms didn't come from the same bag), why can't we calculate P(Bag from 1996|Green M&M)*P(Bag from 1994|Yellow M&M) = 0.666*0.588? This gives us 0.392, which is not the correct answer. The author's solution is provided below for reference.

Hypotheses:

  • A: Bag #1 from 1994 and Bag #2 from 1996
  • B: Bag #2 from 1994 and Bag #1 from 1996

Again, P(A) = P(B) = 1/2.

The evidence is: E: yellow from Bag #1, green from Bag #2

We get the likelihoods by multiplying the probabilities for the two M&M:

P(E|A) = (0.2)(0.2)

P(E|B) = (0.1)(0.14)

For example, P(E|B) is the probability of a yellow M&M in 1996 (0.14) times the probability of a green M&M in 1994 (0.1).

Plugging the likelihoods and the priors into Bayes's theorem, we get P(A|E) = 40 / 54 ~ 0.74
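The arithmetic behind 40/54 can be checked with a minimal Python sketch. The variable names (`mix_1994`, `mix_1996`, `prior`, `likelihood`) are illustrative choices and do not come from the author's post; only the percentages are taken from the problem statement.

```python
from fractions import Fraction

# Color shares quoted in the problem statement (only the two colors we need).
mix_1994 = {"yellow": Fraction(20, 100), "green": Fraction(10, 100)}
mix_1996 = {"yellow": Fraction(14, 100), "green": Fraction(20, 100)}

# Hypothesis A: Bag #1 is from 1994, Bag #2 from 1996. B is the reverse.
prior = {"A": Fraction(1, 2), "B": Fraction(1, 2)}

# Evidence E: yellow from Bag #1, green from Bag #2.
likelihood = {
    "A": mix_1994["yellow"] * mix_1996["green"],  # (0.2)(0.2)
    "B": mix_1996["yellow"] * mix_1994["green"],  # (0.14)(0.1)
}

# Bayes's theorem: normalize prior * likelihood over both hypotheses.
evidence = sum(prior[h] * likelihood[h] for h in prior)
posterior = {h: prior[h] * likelihood[h] / evidence for h in prior}

print(posterior["A"])         # 20/27, i.e. 40/54 in lowest terms
print(float(posterior["A"]))  # roughly 0.74
```

Using `Fraction` keeps the result exact, so the reduced form of 40/54 is visible directly.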

By introducing the terms Bag #1 and Bag #2, rather than "the bag the yellow M&M came from" and "the bag the green came from," I avoided the part of this problem that can be tricky: keeping the hypotheses and the evidence straight.

On BEST ANSWER

However, if the probability of selecting an M&M from either bag is independent ...

Boop.   The draws of those M&Ms from the two bags are not independent.   They are conditionally independent given the arrangement of the two bags.
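A short exact check of this claim, as a sketch: with $Y$ = yellow from the first bag and $G$ = green from the second, the joint probability $\mathsf P(Y,G)$ differs from the product $\mathsf P(Y)\,\mathsf P(G)$ once the unknown arrangement is averaged out. The names `p_y`, `p_g`, and the labels `"A"`/`"B"` (first bag from 1994 / second bag from 1994) are illustrative assumptions.

```python
from fractions import Fraction

half = Fraction(1, 2)  # each arrangement is equally likely a priori

# P(Y | arrangement): yellow from the first bag.
p_y = {"A": Fraction(20, 100), "B": Fraction(14, 100)}
# P(G | arrangement): green from the second bag.
p_g = {"A": Fraction(20, 100), "B": Fraction(10, 100)}

# Marginals, by the Law of Total Probability.
p_y_marg = sum(half * p_y[h] for h in "AB")  # 17/100
p_g_marg = sum(half * p_g[h] for h in "AB")  # 3/20

# Joint: within each arrangement the draws are modelled as independent,
# so P(Y, G | h) = P(Y | h) * P(G | h).
p_joint = sum(half * p_y[h] * p_g[h] for h in "AB")  # 27/1000

print(p_joint, p_y_marg * p_g_marg)     # 27/1000 vs 51/2000
print(p_joint == p_y_marg * p_g_marg)   # False: not independent overall
```

Seeing a yellow from the first bag makes it more likely that the first bag is the 1994 one, which in turn makes a green from the second bag more likely; that is the dependence the numbers show.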

why can't we calculate P(Bag from 1996|Green M&M)*P(Bag from 1994|Yellow M&M) = 0.666*0.588?

Your expectation that a bag is from 1994 will be influenced both by knowing that you drew a yellow M&M from it and by knowing that you drew a green M&M from the other bag.

Plus, whatever the evidence, the event that one bag is from 1994 is not independent of the event that the other bag is from 1996.   They are actually the same event, after all.
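For readers who prefer an empirical check, here is a rough Monte Carlo sketch of the original problem: draw one M&M from a 1994 bag and one from a 1996 bag, keep only the trials where the pair is one yellow and one green, and count how often the yellow came from the 1994 bag. The function name `estimate`, the trial count, and the seed are arbitrary assumptions, and the draws are modelled as independent samples from the stated color mixes.

```python
import random

# Color mixes from the problem statement (percentages used as weights).
colors_1994 = ["brown", "yellow", "red", "green", "orange", "tan"]
weights_1994 = [30, 20, 20, 10, 10, 10]
colors_1996 = ["blue", "green", "orange", "yellow", "red", "brown"]
weights_1996 = [24, 20, 16, 14, 13, 13]


def estimate(trials=200_000, seed=0):
    """Estimate P(yellow came from the 1994 bag | one yellow, one green)."""
    rng = random.Random(seed)
    draws_94 = rng.choices(colors_1994, weights=weights_1994, k=trials)
    draws_96 = rng.choices(colors_1996, weights=weights_1996, k=trials)

    kept = 0            # trials matching the evidence
    yellow_from_94 = 0  # of those, trials where the 1994 bag gave the yellow
    for c94, c96 in zip(draws_94, draws_96):
        if {c94, c96} == {"yellow", "green"}:
            kept += 1
            if c94 == "yellow":
                yellow_from_94 += 1
    return yellow_from_94 / kept


print(estimate())  # should land near 20/27, not near 0.392
```

Only about 5% of trials match the evidence, so the estimate is noisy in the third decimal place, but it is close enough to distinguish the two candidate answers.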


You have the events $Y, G, A, B$: $Y$ is selecting yellow from the first bag, $G$ is selecting green from the second bag, $A$ is the first bag being from 1994, and $B$ is the second bag being from 1994.   $A, B$ are mutually exclusive and exhaustive, and $Y,G$ are conditionally independent given each of the events $A,B$, so by using the definition of conditional probability and the Law of Total Probability:

$$\begin{aligned}\mathsf P(A\mid Y,G) & = \dfrac{\mathsf P(Y,G\mid A)\mathsf P(A)}{\mathsf P(Y,G)} \\ &= \dfrac{\mathsf P(Y\mid A)\mathsf P(G\mid A)\mathsf P(A)}{\mathsf P(Y\mid A)\mathsf P(G\mid A)\mathsf P(A)+\mathsf P(Y\mid B)\mathsf P(G\mid B)\mathsf P(B)}\\ &= \dfrac{\mathsf P(Y\mid A)\mathsf P(G\mid A)}{\mathsf P(Y\mid A)\mathsf P(G\mid A)+\mathsf P(Y\mid B)\mathsf P(G\mid B)} &&\because \mathsf P(A)=\mathsf P(B) \\ \text{Similarly}\\ \mathsf P(A\mid Y) & = \dfrac{\mathsf P(Y\mid A)}{\mathsf P(Y\mid A)+\mathsf P(Y\mid B)}\\\mathsf P(A\mid G) & = \dfrac{\mathsf P(G\mid A)}{\mathsf P(G\mid A)+\mathsf P(G\mid B)} \end{aligned}$$
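Plugging the numbers from the question into these three formulas gives the following sketch. The variable names are illustrative, and the final odds-based step is an added observation rather than part of the derivation above; it relies on the prior odds being 1, i.e. $\mathsf P(A)=\mathsf P(B)$.

```python
from fractions import Fraction

# P(Y | .) for yellow from the first bag, P(G | .) for green from the second.
p_y = {"A": Fraction(20, 100), "B": Fraction(14, 100)}
p_g = {"A": Fraction(20, 100), "B": Fraction(10, 100)}

# The three posteriors, each with equal priors cancelled out.
p_a_given_y = p_y["A"] / (p_y["A"] + p_y["B"])  # 10/17, about 0.588
p_a_given_g = p_g["A"] / (p_g["A"] + p_g["B"])  # 2/3, about 0.667
p_a_given_yg = (p_y["A"] * p_g["A"]) / (
    p_y["A"] * p_g["A"] + p_y["B"] * p_g["B"]
)  # 20/27, about 0.741

# The product proposed in the question is a different quantity.
naive_product = p_a_given_y * p_a_given_g  # 20/51, about 0.392


def odds(p):
    """Convert a probability to odds in favour."""
    return p / (1 - p)


# With equal priors, the two posteriors combine by multiplying odds,
# not probabilities: (10/7) * (2/1) = 20/7, which maps back to 20/27.
combined_odds = odds(p_a_given_y) * odds(p_a_given_g)
from_odds = combined_odds / (1 + combined_odds)

print(p_a_given_y, p_a_given_g, naive_product)  # 10/17 2/3 20/51
print(p_a_given_yg, from_odds)                  # 20/27 20/27
```

The product of the two single-evidence posteriors ignores the shared denominator structure: the numerators multiply as expected, but the normalizing term is $\mathsf P(Y\mid A)\mathsf P(G\mid A)+\mathsf P(Y\mid B)\mathsf P(G\mid B)$, not the product of the two separate sums.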