I've been working my way through an introduction to Bayesian inference in a statistical physics textbook (Tobochnik and Gould, 2010; available online, excellent book). I've run into a problem that I can't quite wrap my head around, even though I thought I understood Bayesian inference up to that point (just before this was an amazing explanation of the Monty Hall problem using Bayes' theorem).
What is happening here? I thought the chance of the test being right was 98%. Why would Bayes' theorem tell us that the chance of actually having the disease, given a positive result, is less than 1 percent? What are we really saying when we say that the test is 98% accurate? Does it have something to do with the fact that the disease is so rare?
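You’ve got it. The test's error rate doesn't change, but when the disease is very rare, almost everyone tested is healthy, so the false positives among them can far outnumber the true positives among the few who are sick. One way to see this is to draw a diagram of the four possibilities: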
The red and purple areas represent incorrect test results while the blue and purple areas represent people who have the disease. As the blue-purple region gets thinner, i.e., the disease gets rarer, the red area of false positives gets bigger relative to it. If the disease is very rare, it can be much bigger.
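To put rough numbers on the picture, here is a minimal sketch in Python. Two things in it are my assumptions rather than anything stated above: the prevalence of 1 in 10,000 is an illustrative value, and I'm reading "98% accurate" as meaning the test gives the right answer 98% of the time for both sick and healthy people. The function name is also just mine.

```python
def posterior_disease_given_positive(prevalence, sensitivity, specificity):
    """Bayes' theorem: P(disease | positive test)."""
    # P(positive and disease): sick people who are correctly flagged.
    true_pos = sensitivity * prevalence
    # P(positive and no disease): healthy people who are wrongly flagged.
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    # Fraction of all positive results that come from sick people.
    return true_pos / (true_pos + false_pos)


# Assumed for illustration: 1 person in 10,000 has the disease.
prevalence = 1e-4
# Assumed reading of "98% accurate": same 98% for sick and healthy people.
accuracy = 0.98

p = posterior_disease_given_positive(prevalence, accuracy, accuracy)
print(f"P(disease | positive) = {p:.4%}")
```

Under those assumptions the result comes out a bit under half a percent. In head counts: out of 1,000,000 people, 100 are sick and 98 of them test positive, while 999,900 are healthy and 19,998 of them test positive anyway. So only 98 of the 20,096 positive results belong to people who actually have the disease.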
This is why, for very rare diseases, testing the general population can be counterproductive: most of the positive results will be false alarms.