How to use Bayes' rule to solve the bag problem from Judea Pearl's book


I am currently reading Chapter 3 of "The Book of Why" by Judea Pearl and came across an interesting problem involving an application of Bayes' rule. It is as follows:

"Suppose you’ve just landed in Zanzibar after making a tight connection in Aachen, and you’re waiting for your suitcase to appear on the carousel. Other passengers have started to get their bags, but you keep waiting… and waiting… and waiting. What are the chances that your suitcase did not actually make the connection from Aachen to Zanzibar? The answer depends, of course, on how long you have been waiting. If the bags have just started to show up on the carousel, perhaps you should be patient and wait a little bit longer. If you’ve been waiting a long time, then things are looking bad.

Let's say that all the bags at Zanzibar airport get unloaded within ten minutes and that the probability your bag made the connection (the bag is on the plane) is 0.5.

See Table of conditional probabilities

This table, though large, should be easy to understand. The first eleven rows say that if your bag didn’t make it onto the plane (bag on plane = false) then, no matter how much time has elapsed, it won’t be on the carousel (carousel = false). That is, P(carousel = false | bag on plane = false) is 100 percent. That is the meaning of the 100s in the first eleven rows. The other eleven rows say that the bags are unloaded from the plane at a steady rate. If your bag is indeed on the plane, there is a 10 percent probability it will be unloaded in the first minute, a 10 percent probability in the second minute, and so forth. For example, after 5 minutes there is a 50 percent probability it has been unloaded, so we see a 50 for P(carousel = true | bag on plane = true, time = 5). After ten minutes, all the bags have been unloaded, so P(carousel = true | bag on plane = true, time = 10) is 100 percent. Thus we see a 100 in the last entry of the table.

The most interesting thing to do with this Bayesian network, as with most Bayesian networks, is to solve the inverse-probability problem: if x minutes have passed and I still haven’t gotten my bag, what is the probability that it was on the plane?" (Judea Pearl, The Book of Why, p.118-121)

The correct answer is approximately 47%, but I couldn't solve for it by hand. How can I reach the same answer by calculation?

Thanks


There is 1 best solution below


We need the inverse probability that the bag was on the plane, given that it has not shown up on the carousel after $x$ minutes. In other words, we want the probability $P(\text{bag on plane=T}\; |\; \text{bag on carousel = F, time = x})$.

According to Bayes' theorem,

$P(\text{bag on plane=T}\; |\; \text{bag on carousel = F, time = x}) = $ $$\frac{P(\text{bag on carousel = F at time = x | bag on plane = T}) \times P(\text{bag on plane = T})}{P(\text{bag on carousel = F at time = x})} \:\:\:\:\:(1)$$

Following Pearl, let's evaluate this expression at $x$ = 5 minutes.

The numerator in equation (1) is the product of a conditional probability and a prior probability. The conditional probability is given by Table 3.3, corresponding to the row 'bag on plane = True', 'time elapsed = 5' (i.e., x =5), and the column 'carousel = false'. This value is 50 percent after 5 minutes if the bag was indeed on the plane. By assumption, the prior probability of the bag being on the plane is 50 percent (or p = 0.5).

So, the numerator of the expression on the right-hand side of equation (1) is: $$50\% \times 0.5 = 25\%$$

The denominator $P(\text{bag on carousel = F at time = x})$ is the sum of the probabilities of two mutually exclusive events:

Either the bag was on the plane but it had not arrived on the carousel by time = $x$,

OR

The bag was not on the carousel because it was not on the plane!

The probability of the first event is the same product that was computed for the numerator: $P(\text{bag on carousel = F at time = x | bag on plane=T}) \times P(\text{bag on plane = T})$. For time = 5, this was already calculated above as $50\% \times 0.5 = 25\%$.

Likewise, the probability of the second event is $$P(\text{bag on carousel = F at time = x | bag on plane=F}) \times P(\text{bag on plane = F})$$

The chance of this event is $$100\% \times 0.5 = 50\%$$ where the first factor is 100 percent because, conditional on the bag not being on the plane, the probability of it not being on the carousel is 100 percent (i.e., $p = 1$). The second factor is the prior probability of the bag not being on the plane, $1 - 0.5 = 0.5$.

These are the only two ways the bag could not appear on the carousel. Therefore,

$$P(\text{bag on carousel = F at time = x}) = P(\text{bag on carousel = F at time = x | bag on plane=T}) \times 0.5 + P(\text{bag on carousel = F at time = x | bag on plane=F}) \times 0.5$$

For $x = 5$ this is $$50\% \times 0.5 + 100\% \times 0.5 = 75\%$$

Thus the denominator of the expression on the right-hand side of (1) is 75 percent.

Now that we have both the numerator and the denominator of the right-hand side of equation (1), we can compute the probability:

$$P(\text{bag on plane=T}\; |\; \text{bag on carousel=F, time=5}) = \frac{25\%}{75\%} = \frac{1}{3} \approx 0.33$$ or 33 percent.
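
The same calculation can be sketched in Python for any waiting time. This is only an illustration, not code from the book: the function and constant names are made up here, and it assumes the steady unloading rate described in the quoted passage (10 percent of the way per minute, so the chance the bag is still missing after $x$ minutes, given it was on the plane, is $1 - x/10$).

```python
from fractions import Fraction

PRIOR_ON_PLANE = Fraction(1, 2)  # prior from the problem statement
UNLOAD_MINUTES = 10              # all bags are unloaded within ten minutes


def p_not_on_carousel_given_on_plane(minutes):
    """P(carousel = F | bag on plane = T, time = minutes), assuming steady unloading."""
    return 1 - Fraction(min(minutes, UNLOAD_MINUTES), UNLOAD_MINUTES)


def posterior_on_plane(minutes):
    """P(bag on plane = T | carousel = F, time = minutes) via Bayes' rule."""
    # Numerator of equation (1): likelihood times prior.
    numerator = p_not_on_carousel_given_on_plane(minutes) * PRIOR_ON_PLANE
    # Denominator: add the case where the bag never made the plane (likelihood 1).
    denominator = numerator + 1 * (1 - PRIOR_ON_PLANE)
    return numerator / denominator


# Print the posterior for every whole minute from 0 to 10.
for minutes in range(UNLOAD_MINUTES + 1):
    print(minutes, float(posterior_on_plane(minutes)))
```

Exact fractions are used so that `posterior_on_plane(5)` comes out as exactly 1/3 rather than a rounded decimal.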

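According to Judea Pearl, "After five minutes, the probability drops to 33 percent". Likewise, one can calculate this probability after the first minute ($x = 1$). The table then gives a 90 percent chance that the bag is not yet on the carousel if it was on the plane, so the probability is $\frac{0.9 \times 0.5}{0.9 \times 0.5 + 1 \times 0.5} = \frac{0.45}{0.95} \approx 0.47$, or 47 percent.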
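
For a general waiting time $x$ between 0 and 10 minutes, the steps above can be collected into a single expression. This is a sketch that assumes the steady unloading rate from the quoted passage, so that $P(\text{bag on carousel = F at time = x | bag on plane = T}) = 1 - x/10$:

$$P(\text{bag on plane=T}\; |\; \text{bag on carousel = F, time = x}) = \frac{(1 - x/10) \times 0.5}{(1 - x/10) \times 0.5 + 1 \times 0.5} = \frac{10 - x}{20 - x}$$

Substituting $x = 5$ and $x = 1$ recovers the 33 percent and 47 percent figures. At $x = 0$ the expression equals the prior of 50 percent, and at $x = 10$ it is 0, since by then every bag on the plane has been unloaded.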