Context: Second year uni Bayesian statistics question
Question: Suppose only $0.1$% of new treatments are actually effective. Of all tests in which a new treatment is declared effective, what proportion of those results are actually false positives?
Additional info from question:
- $H_0$: New treatment is ineffective
- $H_1$: New treatment is effective
- Significance level = $\alpha$ = $0.05$
- Power = $1-\beta$ = $0.8$
Attempt:
Let $A$: $H_0$ is accepted (so $A^C$: $H_0$ is rejected). Let $B$: $H_0$ is true (so $B^C$: $H_0$ is false). Also, $\alpha = \mathbb{P}(A^C|B) = 0.05$ and $1-\beta = \mathbb{P}(A^C|B^C) = 0.8$. Since $0.1$% of new treatments are effective, $\mathbb{P}(B^C)=0.001$. However, I am unsure how to translate the question into a probability statement, or whether it would be a union or conditional probability of some combination of $A,B,A^C,B^C$. Any help would be greatly appreciated.
Let $D$ be the event that a treatment is declared effective, and $\bar D$ the complementary event that a treatment is declared ineffective. Let $E$ be the event that the treatment is truly effective, and $\bar E$ the complementary event the treatment is truly ineffective.
Then the given information is translated into the following probability statements:
$$\Pr[E] = 0.001, \qquad \Pr[D \mid \bar E] = 0.05, \qquad \Pr[D \mid E] = 0.8.$$
The first equation simply states that the probability of a treatment being truly effective is $0.001$. The second equation asserts that the probability of a truly ineffective treatment being declared effective (i.e., Type I error) is $0.05$. The third equation asserts that the probability of a truly effective treatment being correctly identified as effective is $0.8$--this is the power.
We want to compute $\Pr[\bar E \mid D]$, the probability of the treatment being truly ineffective, given that it is declared effective--this is the "false positive" case. To this end, we simply use Bayes' theorem:
$$\Pr[\bar E \mid D] = \frac{\Pr[D \mid \bar E]\Pr[\bar E]}{\Pr[D]}.$$ The numerator is straightforward, since $\Pr[\bar E] = 1 - \Pr[E]$. The denominator needs additional conditioning via the law of total probability:
$$\Pr[D] = \Pr[D \mid \bar E]\Pr[\bar E] + \Pr[D \mid E]\Pr[E].$$ All that is left is to substitute and evaluate.
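The substitution can be checked numerically. Here is a minimal sketch in Python (the variable names are my own, chosen to mirror the notation above):

```python
# Given quantities from the problem statement
p_E = 0.001            # P(E): prevalence of truly effective treatments
p_D_given_notE = 0.05  # P(D | not E): Type I error tolerance, alpha
p_D_given_E = 0.8      # P(D | E): power, 1 - beta

p_notE = 1 - p_E       # P(not E)

# Law of total probability: P(D)
p_D = p_D_given_notE * p_notE + p_D_given_E * p_E

# Bayes' theorem: P(not E | D)
p_notE_given_D = p_D_given_notE * p_notE / p_D

print(f"P(D) = {p_D:.5f}")                      # 0.05075
print(f"P(not E | D) = {p_notE_given_D:.4f}")   # 0.9842
```

So roughly $98.4\%$ of declarations of efficacy are false positives.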
As you can see, the key to solving the question is understanding how to translate the conditions of the hypothesis test into equivalent statements about the relationship between what is declared about the treatment's efficacy and what is actually true about it. Once you see this, structuring the computation with conditional probabilities becomes routine.
You will find that the false positive proportion of this test is alarmingly high; so high, in fact, as to render such declarations of efficacy practically worthless. Why is this so, when our Type I error tolerance is $0.05$ and our power is $80\%$? After all, these are quite common choices for many hypothesis tests and sample size calculations. The reason is that $E$ is a rare event: the prevalence of effective treatments is substantially lower than the tolerance for error. Consequently, when a test of a treatment's efficacy comes back positive, it is far more likely that the result arose from random chance (Type I error) than from the treatment actually being effective.
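To see the role of rarity directly, one can hold $\alpha$ and the power fixed and vary the prevalence. A short illustrative sketch (the grid of prevalence values is my own choice, not from the question):

```python
def false_positive_proportion(prevalence, alpha=0.05, power=0.8):
    """P(not E | D): proportion of 'declared effective' results
    that are false positives, via Bayes' theorem."""
    p_D = alpha * (1 - prevalence) + power * prevalence
    return alpha * (1 - prevalence) / p_D

for prev in (0.001, 0.01, 0.1, 0.5):
    fpp = false_positive_proportion(prev)
    print(f"prevalence {prev:>5}: P(not E | D) = {fpp:.3f}")
# prevalence 0.001: 0.984
# prevalence  0.01: 0.861
# prevalence   0.1: 0.360
# prevalence   0.5: 0.059
```

As the prevalence rises toward the error tolerance and beyond, the false positive proportion collapses; the problem is the base rate, not the test's nominal error rates.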
In epidemiological and biostatistical contexts, then, when we are faced with diagnosing a rare disease or condition, the diagnostic criteria must be highly accurate: the tolerance for Type I error must be extremely low, or else the true detection rate is drowned out by false positive signals. The Bayesian framework lets us quantify this issue.
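The effect of tightening the Type I error tolerance can be illustrated the same way, holding the prevalence at $0.001$ and the power at $0.8$ (the grid of $\alpha$ values is my own choice for illustration):

```python
def false_positive_proportion(prevalence, alpha, power=0.8):
    """P(not E | D) via Bayes' theorem."""
    p_D = alpha * (1 - prevalence) + power * prevalence
    return alpha * (1 - prevalence) / p_D

for alpha in (0.05, 0.005, 0.0005):
    fpp = false_positive_proportion(0.001, alpha)
    print(f"alpha {alpha:>6}: P(not E | D) = {fpp:.3f}")
# alpha   0.05: 0.984
# alpha  0.005: 0.862
# alpha 0.0005: 0.384
```

Only when $\alpha$ shrinks to the order of the prevalence itself does the false positive proportion fall to a tolerable level.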