Can someone please verify my solutions for this probability question on Bayes' theorem?


Assume a COVID test can identify the presence of COVID, given that the person has COVID, with probability $p_d$. Assume the test assigns false positives with probability $p_f$ (a test will be positive but the person does not have COVID). Let $p_{\theta}$ be the prior probability a person has COVID.

a) Calculate the probability that a test subject has COVID, given a test was positive. Do the same given the test was negative. Calculate the probability the test subject doesn’t have COVID given the test was positive, and then do the same for a negative test.

*** $P(COVID|+)=\frac{P(+|COVID)P(COVID)}{P(+)}$

Using law of total probability for $P(+)$:

$P(+)=P(+|COVID)P(COVID)+P(+|NOCOVID)P(NOCOVID)$

$=p_dp_{\theta}+p_f(1-p_{\theta})$

$\implies P(COVID|+)=\frac{p_{d}p_{\theta}}{p_dp_{\theta}+p_f(1-p_{\theta})}$

*** $P(COVID|-)= \frac{P(-|COVID)P(COVID)}{P(-)}$

$P(-)=P(-|COVID)P(COVID)+P(-|NOCOVID)P(NOCOVID)$

$= (1-p_d)p_{\theta}+(1-p_f)(1-p_{\theta})$

$\implies P(COVID|-)=\frac{(1-p_d)p_{\theta}}{(1-p_d)p_{\theta}+(1-p_f)(1-p_{\theta})}$

*** $P(NOCOVID|+)= 1-P(COVID|+) = 1 - \frac{p_{d}p_{\theta}}{p_dp_{\theta}+p_f(1-p_{\theta})}$

*** $P(NOCOVID|-) = 1 - P(COVID|-) = 1 - \frac{(1-p_d)p_{\theta}}{(1-p_d)p_{\theta}+(1-p_f)(1-p_{\theta})}$
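As a quick numerical sanity check of the four formulas in (a), here is a minimal Python sketch. The function name `posteriors` and the dictionary keys are just illustrative choices, and the example values $p_d=0.8$, $p_f=0.5$, $p_{\theta}=0.1$ are the ones given in part (b).

```python
def posteriors(p_d, p_f, p_theta):
    """Posterior probabilities after a single test, via Bayes' theorem.

    p_d     : P(+ | COVID), the detection probability
    p_f     : P(+ | no COVID), the false-positive probability
    p_theta : prior P(COVID)
    """
    # Law of total probability for each test outcome
    p_pos = p_d * p_theta + p_f * (1 - p_theta)
    p_neg = (1 - p_d) * p_theta + (1 - p_f) * (1 - p_theta)

    covid_given_pos = p_d * p_theta / p_pos
    covid_given_neg = (1 - p_d) * p_theta / p_neg

    return {
        "covid_given_pos": covid_given_pos,
        "covid_given_neg": covid_given_neg,
        "nocovid_given_pos": 1 - covid_given_pos,
        "nocovid_given_neg": 1 - covid_given_neg,
    }


if __name__ == "__main__":
    for name, value in posteriors(0.8, 0.5, 0.1).items():
        print(f"{name}: {value:.4f}")
```

With these inputs, $P(COVID|+)$ evaluates to $0.08/0.53 \approx 0.15$, which is the number used in the attempt at (b).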

b) Assume $p_d=0.8$ and $p_f=0.5$, and assume that the prior probability any person has COVID is 0.1. Given a person has COVID, how many positive tests in a row do they need to take to be 99% confident they have it? 99.9%?

$1-P(COVID|+)^{n} \geq 0.99$

$\implies 1-\Big[ \frac{(0.8)(0.1)}{(0.8)(0.1)+(0.5)(1-0.1)} \Big]^{n}\geq 0.99$

$\implies 1-(0.15)^n\geq 0.99$

$n\geq 2.4$

Need at least three tests.

For 99.9%, perform the same calculation but with 0.999 instead of 0.99. We get

$1-(0.15)^n \geq 0.999$

$\implies n \geq 3.6$

Need at least 4 tests


From comments:

Your $\dfrac{p_{d}p_{\theta}}{p_dp_{\theta}+p_f(1-p_{\theta})}$ looks correct.

Your $1 - \dfrac{(1-p_d)p_{\theta}}{(1-p_d)p_{\theta}+(1-p_f)(1-p_{\theta})}$ also looks right, though the equivalent form $\dfrac{(1-p_f)(1-p_{\theta})}{(1-p_d)p_{\theta}+(1-p_f)(1-p_{\theta})}$ is perhaps better.

Assuming (unrealistically) that successive tests of the same person are independent conditional on that person's actual status, your answer to (b) looks wrong to me: you are raising the wrong quantity to the power $n$. I think you want to look at something like $\dfrac{p_{d}^n p_{\theta}}{p_d^n p_{\theta}+p_f^n (1-p_{\theta})}$, which gives larger values of $n$. This leads to something solvable for $n$, particularly if you use log-odds and log-likelihood-ratios. Or you can simply evaluate it for $n=1,2,\cdots,20$.
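A sketch of the log-odds step, under the same conditional-independence assumption and writing the target confidence as $k$: the posterior after $n$ positive tests is at least $k$ exactly when the posterior odds are at least $\frac{k}{1-k}$, and the posterior odds are the prior odds multiplied by the likelihood ratio once per test:

$\dfrac{P(COVID|+^n)}{P(NOCOVID|+^n)} = \left(\dfrac{p_d}{p_f}\right)^n \dfrac{p_{\theta}}{1-p_{\theta}} \ge \dfrac{k}{1-k}$

Taking logarithms gives

$n \log\dfrac{p_d}{p_f} \ge \log\dfrac{k}{1-k} - \log\dfrac{p_{\theta}}{1-p_{\theta}}$

When $p_d > p_f$ the factor $\log\frac{p_d}{p_f}$ is positive, so dividing by it keeps the direction of the inequality.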

$\dfrac{p_{d}^n p_{\theta}}{p_d^n p_{\theta}+p_f^n (1-p_{\theta})} \ge k$ is the same as $n \ge \dfrac{\log \frac{k}{1-k} -\log\frac{p_{\theta}}{1-p_{\theta}}}{\log\frac{p_d}{p_f}}$ (for $p_d > p_f$), and for example with $k=0.99$ I would get $n \ge 14.45$, i.e. at least 15 positive tests in a row.
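To check the numbers, here is a small Python sketch that computes the minimum $n$ two ways: from the log-odds bound, dividing by $\log\frac{p_d}{p_f}$ (positive when $p_d > p_f$), and by direct evaluation of the posterior for $n=0,1,2,\ldots$. The function names are hypothetical, and the code assumes the tests are conditionally independent given the person's true status, as above.

```python
import math


def posterior_after_n_positives(n, p_d, p_f, p_theta):
    """P(COVID | n positive tests in a row), assuming the tests are
    conditionally independent given the person's true status."""
    num = p_d**n * p_theta
    return num / (num + p_f**n * (1 - p_theta))


def log_odds_bound(k, p_d, p_f, p_theta):
    """Real-valued lower bound on n from the log-odds inequality.
    Requires p_d > p_f so that each positive test raises the odds."""
    if not p_d > p_f:
        raise ValueError("need p_d > p_f for positive tests to be informative")
    logit = lambda p: math.log(p / (1 - p))
    return (logit(k) - logit(p_theta)) / math.log(p_d / p_f)


def min_positive_tests(k, p_d, p_f, p_theta, max_n=10_000):
    """Smallest n whose posterior reaches k, found by direct evaluation."""
    for n in range(max_n + 1):
        if posterior_after_n_positives(n, p_d, p_f, p_theta) >= k:
            return n
    raise ValueError("confidence level not reached within max_n tests")


if __name__ == "__main__":
    for k in (0.99, 0.999):
        bound = log_odds_bound(k, 0.8, 0.5, 0.1)
        n = min_positive_tests(k, 0.8, 0.5, 0.1)
        print(f"k={k}: bound {bound:.2f}, minimum n = {n}")
```

The direct search avoids rounding trouble if the bound happens to land on an integer. For $k=0.99$ it agrees with the bound of about $14.45$, giving 15 tests.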