Is it possible to solve this Bayesian problem with the given data?


Approximately $1$ in $14$ men over the age of $50$ has prostate cancer. The level of 'prostate specific antigen' (PSA) is used as a preliminary screening test for prostate cancer.

$7\%$ of men with prostate cancer do not have a high level of PSA. These results are known as 'false negatives'.

$75\%$ of those men with a high level of PSA do not have cancer. These results are known as 'false positives'.

If a man over $50$ has a normal level of PSA, what are the chances that he has prostate cancer?

This is how I interpret the problem: Our sample space is the population of men over $50$. If $C$ denotes the statement that an individual has cancer and $N$ denotes the statement that his PSA level is normal, we know that:

$$P(N \mid C) = 0.07$$ $$P(\neg C \mid \neg N)= 0.75$$ $$P(C) = \frac{1}{14}$$

The problem asks us to find $$P(C \mid N)$$

Using Bayes' theorem: $$P(C\mid N)={\frac {P(N\mid C)\,P(C)}{P(N\mid C)P(C)+P(N\mid \neg C)P(\neg C)}}$$

The problem is that we don't know $P(N\mid \neg C)$. I tried to find it using set theoretic relations but I couldn't get rid of $P(N)$.

Is it possible to solve this problem, or is the given information insufficient?


There are 2 best solutions below

3
On BEST ANSWER

Applying Bayes' formula can be tricky, and it is particularly tricky here because the false-negative and false-positive figures are conditioned on different things: the false-negative rate is a fraction of the men with cancer, while the false-positive rate is a fraction of the men with a high PSA level. In this case, we can just apply some logic. There are four possibilities: a true positive (cancer, elevated level), a false positive (no cancer, elevated level), a false negative (cancer, normal level), and a true negative (no cancer, normal level). Call the probabilities of these $P_1$, $P_2$, $P_3$, and $P_4$. There are four equations relating them. $$P_1+P_2+P_3+P_4=1$$ $$P_1+P_3=\frac{1}{14}$$ $$\frac{P_3}{P_1+P_3}=0.07$$ $$\frac{P_2}{P_1+P_2}=0.75$$ Each of these can be rewritten as a linear equation, and the resulting system is consistent, so you can solve for all four probabilities. The quantity the question asks for is then $\frac{P_3}{P_3+P_4}$.
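As a rough sketch of the "convert to linear equations and solve" step, the snippet below clears the denominators and solves the system exactly with Python's `fractions` module. It reads the false-negative condition as $P_3/(P_1+P_3)=0.07$ (normal PSA among men with cancer), which becomes $7P_1-93P_3=0$, and the false-positive condition as $P_2/(P_1+P_2)=0.75$, which becomes $3P_1-P_2=0$. The helper `solve` and the matrix layout are illustrative choices, not part of the original answer.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination using exact fractions.

    Hypothetical helper for this example; assumes A is square and invertible.
    """
    n = len(A)
    # Build the augmented matrix [A | b] with Fraction entries.
    M = [[F(v) for v in row] + [F(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Pick a row at or below `col` with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so the pivot becomes 1.
        M[col] = [v / M[col][col] for v in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return [row[-1] for row in M]

# Unknowns in order: P1 (true +), P2 (false +), P3 (false -), P4 (true -).
A = [
    [1,  1,   1, 1],   # P1 + P2 + P3 + P4 = 1
    [1,  0,   1, 0],   # P1 + P3 = 1/14
    [7,  0, -93, 0],   # P3 / (P1 + P3) = 7/100  ->  7 P1 - 93 P3 = 0
    [3, -1,   0, 0],   # P2 / (P1 + P2) = 3/4    ->  3 P1 - P2 = 0
]
b = [1, F(1, 14), 0, 0]

P1, P2, P3, P4 = solve(A, b)
answer = P3 / (P3 + P4)   # P(cancer | normal PSA)
print(P1, P2, P3, P4)
print(answer, float(answer))
```

Under these readings the result is $P_3/(P_3+P_4)=7/1028$, a little under $0.7\%$.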

0
On

The information is sufficient. To solve it, one straightforward method is to let

  • $a$ be the fraction of men with normal PSA and cancer,
  • $b$ be the fraction of men with normal PSA and no cancer,
  • $c$ be the fraction of men with high PSA and cancer,
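  • $d$ be the fraction of men with high PSA and no cancer.

Then the information given is $$\begin{gathered} a + c = \frac1{14} \\ a + b + c + d = 1 \\ \frac{a}{a+c} = .07 \\ \frac{d}{c+d} = \frac34 \end{gathered}$$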

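From the first and third of these you get $a$, and the remaining values follow in turn: $$\begin{gathered} 14 a = .07 \implies a= \frac1{200} \\ c = \frac1{14}-\frac1{200} = \frac{93}{1400} \\ d = \frac34 d + \frac34 \cdot \frac{93}{1400}\implies d= \frac{279}{1400} \\ b= \frac{13}{14}-d \implies b=\frac{1021}{1400} \\ a+b = \frac{1028}{1400} \implies \frac{a}{a+b} = \frac{7}{1028} \end{gathered}$$ So the chance that a man over $50$ with a normal PSA level has prostate cancer is $\frac{7}{1028}\approx 0.68\%$.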
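The same number can also be reached in the notation of the question, which shows how to get past the unknown $P(N)$. Since $P(\neg C\mid\neg N)=0.75$ gives $P(C\mid\neg N)=0.25$, Bayes' theorem applied to $\neg N$ yields $P(\neg N)=P(\neg N\mid C)\,P(C)/P(C\mid\neg N)$, and $P(N)$ follows. The snippet below is only a sketch of that check; the variable names are made up for this example.

```python
from fractions import Fraction

# Given data, in the question's notation (C = cancer, N = normal PSA).
p_C = Fraction(1, 14)               # P(C)
p_N_given_C = Fraction(7, 100)      # P(N | C), false negatives
p_notC_given_notN = Fraction(3, 4)  # P(not C | not N), false positives

# Complements of the two conditional probabilities.
p_notN_given_C = 1 - p_N_given_C        # P(not N | C) = 0.93
p_C_given_notN = 1 - p_notC_given_notN  # P(C | not N) = 0.25

# Bayes on "not N": P(C | not N) = P(not N | C) P(C) / P(not N),
# rearranged to recover P(not N), and from it P(N).
p_notN = p_notN_given_C * p_C / p_C_given_notN
p_N = 1 - p_notN

# Bayes on N gives the requested probability.
p_C_given_N = p_N_given_C * p_C / p_N

# The quantity the question said was missing, recovered from total probability:
# P(N) = P(N | C) P(C) + P(N | not C) P(not C).
p_N_given_notC = (p_N - p_N_given_C * p_C) / (1 - p_C)

print(p_N, p_C_given_N, p_N_given_notC)
```

This gives $P(N)=\frac{257}{350}$, $P(N\mid\neg C)=\frac{1021}{1300}$, and $P(C\mid N)=\frac{7}{1028}$, matching $a+b$ and $\frac{a}{a+b}$ from the table of fractions.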