Bayes Theorem/Law of total probability question.


I'm having a hard time building intuition behind some Bayes Theorem/Law of total probability problems and understanding why my attempts are incorrect in the first place, for this question in particular:

Question: It is believed that one percent of children have autism. A test for autism has been developed whereby 90% of autistic children are correctly identified as having autism but 3% of non-autistic children are incorrectly identified as having autism. A child is tested at two independent clinics. What is the probability that the two clinics have the same diagnosis?

Attempt at solving the problem:

Let $A$ be the event that a child has autism, and $B$ the event that a child tests positive for autism (in a clinic).

We are given:

$P(A)=0.01$ (one percent of children have autism).

$P(B|A)=0.9$ and $P(B|\overline{A})=0.03$ (corresponds to 90% of autistic children having a positive test result and 3% of non-autistic children having a positive test result)

Let the subscripts $1$ and $2$ denote the two clinics at which the child is tested. Then the probability that both clinics yield the same diagnosis is:

$P((B_{1}\cap B_{2})\cup(\overline{B_{1}}\cap \overline{B_{2}})) = P(B_{1}\cap B_{2}) + P(\overline{B_{1}}\cap \overline{B_{2}})$(*)

where $P(B_i)=P(B)$ and $P(B_i|A)=P(B|A)$ for $i=1,2$ (this should follow from the fact that both clinics have the same probabilities of diagnosing a child)

At this point I had two different ways to approach the problem. I could either express (*) as:

$P(B_{1}\cap B_{2}) + P(\overline{B_{1}}\cap \overline{B_{2}}) = P(A)P(B_{1}\cap B_{2}|A)+P(\overline{A})P(B_{1}\cap B_{2}|\overline{A})+P(A)P(\overline{B_{1}}\cap \overline{B_{2}}|A)+P(\overline{A})P(\overline{B_{1}}\cap \overline{B_{2}}|\overline{A})$

(follows from the law of total probability)

(Knowing $B_1$ and $B_2$ are independent given the child's true status, it follows that I can express $P({B_{1}}\cap {B_{2}}|A)$ as $P({B_{1}}|A)P({B_{2}}|A)$, and this approach leads me to the 'correct' answer)

or I could compute $P(B)=P(A)P(B|A)+P(\overline{A})P(B|\overline{A})$ and express (*) as:

$P(B_{1}\cap B_{2}) + P(\overline{B_{1}}\cap \overline{B_{2}}) = P(B_{1})P(B_{2})+P(\overline{B_{1}})P(\overline{B_{2}})$

(this follows from the clinics being independent and computing $P(B)$ from the law of total probability)

Why does computing the probability with this second approach lead me to the wrong answer?

There are 2 best solutions below

Best answer:

For your second method, if we were to expand just one of the two terms we would actually get: $\begin{align}P(B_1\cap B_2) & = P(A_1)P(B_1\mid A_1)P(A_2\mid A_1)P(B_2\mid A_1\cap A_2) \\ & + P(A_1)P(B_1\mid A_1)P(\bar A_2\mid A_1)P(B_2\mid A_1\cap\bar A_2) \\ & + P(\bar A_1)P(B_1\mid \bar A_1)P(A_2\mid \bar A_1)P(B_2\mid \bar A_1\cap A_2) \\ & + P(\bar A_1)P(B_1\mid \bar A_1)P(\bar A_2\mid \bar A_1)P(B_2\mid \bar A_1\cap\bar A_2)\end{align}$

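Here $A_1$ is the event that the child has autism when visiting the first clinic, and $A_2$ is the event that the child has autism when visiting the second clinic.

But wait! It is the same individual, so $A_2$ certainly happens whenever $A_1$ does, and $\bar A_2$ whenever $\bar A_1$. That is, $P(A_2\mid A_1)=1$ and $P(\bar A_2\mid A_1)=0$, and likewise $P(\bar A_2\mid \bar A_1)=1$ and $P(A_2\mid \bar A_1)=0$, so the two middle terms vanish.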

Hence, after removing redundancies, everything simplifies to your first approach:

$$\begin{align}P(B_1 \cap B_2 )+P(\bar B_1 \cap \bar B_2) & = P(A)P(B_1 \cap B_2\mid A )+ P(\bar A)P(B_1 \cap B_2\mid \bar A )+P(A)P(\bar B_1 \cap \bar B_2\mid A)+ P(\bar A)P(\bar B_1 \cap \bar B_2\mid \bar A) \\[2ex] & = P(A)P(B_1\mid A) P( B_2\mid A )+ P(\bar A)P(B_1\mid \bar A)P(B_2\mid \bar A ) \\ & \quad +P(A)P(\bar B_1\mid A)P(\bar B_2\mid A)+ P(\bar A)P(\bar B_1 \mid \bar A)P(\bar B_2\mid \bar A) \end{align}$$

That is it.


tl;dr: You can't claim unconditional independence, $P(B_1\cap B_2)= P(B_1)P(B_2)$, when the test results are known to be for the same individual. At best you can only claim conditional independence, $P(B_1\cap B_2\mid A)=P(B_1\mid A)P(B_2\mid A)$.
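To make the failure of unconditional independence concrete, here is a minimal Python sketch that evaluates both sides of $P(B_1\cap B_2)= P(B_1)P(B_2)$ with the numbers from the question. The variable and function names are illustrative choices, not part of the original problem, and the joint probability is computed under the conditional-independence assumption stated above.

```python
# Numbers from the question; all names here are illustrative.
p_a = 0.01          # P(A): prevalence of autism
p_pos_a = 0.90      # P(B | A): positive test given autism
p_pos_not_a = 0.03  # P(B | not A): positive test given no autism


def joint_positive(p_a, p_pos_a, p_pos_not_a):
    """P(B1 and B2) for one child, assuming the two tests are
    conditionally independent given the child's true status."""
    return p_a * p_pos_a**2 + (1 - p_a) * p_pos_not_a**2


def marginal_positive(p_a, p_pos_a, p_pos_not_a):
    """P(B) for a single test, by the law of total probability."""
    return p_a * p_pos_a + (1 - p_a) * p_pos_not_a


joint = joint_positive(p_a, p_pos_a, p_pos_not_a)
product = marginal_positive(p_a, p_pos_a, p_pos_not_a) ** 2

print(f"P(B1 and B2)  = {joint:.6f}")
print(f"P(B1) * P(B2) = {product:.6f}")
```

The joint probability comes out near $0.008991$ while the product of the marginals is near $0.001498$, so $B_1$ and $B_2$ are not independent: a positive result at the first clinic makes autism, and hence a second positive, much more likely.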

Second answer:

Let $D$ be the occurrence of being diagnosed with autism and $A$ be the state of having autism.

One might be tempted to compute the probability that a diagnosis is the same at both clinics with

$$ \begin{align} &P(D)^2+P(\lnot D)^2\\ &=\left[P(D\mid A)P(A)+P(D\mid\lnot A)P(\lnot A)\right]^2 +\left[P(\lnot D\mid A)P(A)+P(\lnot D\mid\lnot A)P(\lnot A)\right]^2\\ &=\left[0.9\times0.01+0.03\times0.99\right]^2 +\left[0.1\times0.01+0.97\times0.99\right]^2\\ &=0.92559538 \end{align} $$

However, this is the probability that the diagnosis for a random child at the first clinic is the same as the diagnosis for a (possibly different) random child at the second clinic. This is not what the question asks.

The probability that a random child has the same diagnosis at both clinics is $$ \begin{align} &\overbrace{\left[P(D\mid A)^2+P(\lnot D\mid A)^2\right]}^{\text{same given autistic}}P(A) +\overbrace{\left[P(D\mid\lnot A)^2+P(\lnot D\mid\lnot A)^2\right]}^{\text{same given not autistic}}P(\lnot A)\\ &=\left[0.9^2+0.1^2\right]\times0.01+\left[0.03^2+0.97^2\right]\times0.99\\ &=0.940582 \end{align} $$
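The distinction between the two calculations can also be checked empirically. The following is a rough Monte Carlo sketch, not a definitive implementation: the function name `simulate`, the trial count, and the seed are arbitrary assumptions, and the only inputs taken from the question are the three probabilities. It estimates the agreement rate when both clinics see the same child and when they see two independently drawn children.

```python
import random


def simulate(n_trials=200_000, seed=0,
             p_a=0.01, p_pos_a=0.90, p_pos_not_a=0.03):
    """Estimate how often two clinics agree, for the same child
    and for two independently chosen children."""
    rng = random.Random(seed)

    def diagnose(has_autism):
        # One clinic's test: positive with P(D | A) or P(D | not A).
        p = p_pos_a if has_autism else p_pos_not_a
        return rng.random() < p

    same_child = 0
    different_children = 0
    for _ in range(n_trials):
        child = rng.random() < p_a   # true status of the tested child
        other = rng.random() < p_a   # true status of an unrelated child
        first = diagnose(child)      # result at the first clinic
        if first == diagnose(child):
            same_child += 1          # second clinic tests the same child
        if first == diagnose(other):
            different_children += 1  # second clinic tests a different child
    return same_child / n_trials, different_children / n_trials


same, different = simulate()
print(f"same child:         {same:.4f}")
print(f"different children: {different:.4f}")
```

With enough trials, the first estimate should settle near $0.940582$ and the second near $0.92559538$, matching the two formulas above up to sampling noise.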