Calculate the probability using Bayes' rule


Case
Scientists find that bacteria alpha occurs in $25\%$ of rainforests. If bacteria alpha is in fact present, there is a $50\%$ chance of detecting it in a single search. Three searches fail to detect bacteria alpha.

Questions
1) What is the probability that all three searches will fail to detect the bacteria, if in fact it is present?

Bayes' rule is given as $$\Pr(H_1|D)=\frac{\Pr(D|H_1)*\Pr(H_1)}{\sum_j \Pr(D|H_j)*\Pr(H_j)}$$

2) If the bacteria is recorded as absent in each of three searches, use Bayes' rule to calculate the posterior probability that it is in fact present at the site.


My attempt

For question 1, since there is a $50\%$ chance of detecting the bacteria in a search when it is present, there is also a $50\%$ chance of not detecting it. The three searches are assumed to be independent, therefore

  • $\Pr(\text{no detection in three searches} \mid \text{present}) = (1-0.5)^3 = 0.125$

Is this correct?

For question 2, I have no idea. Can someone please show the working for this question?

Thank you for taking the time to read this.

There are 3 best solutions below

3
On BEST ANSWER

Original, incorrect, analysis

I'm not completely certain what posterior probability means, but let's reason through this:

  1. Suppose the search site definitely has the bacteria.
  2. It is given that the probability of detection in a single search is $50\%$ when the bacteria is present (this is a parameter of the case). Therefore the probability that all three searches fail should indeed be $0.5^3 = 0.125 = 12.5\%$.
  3. You take a random site (one which may or may not contain the bacteria) and you don't detect the bacteria there. There are two cases:
    1. The bacteria is present ($25\%$), and you fail to detect it with a probability of $0.125$.
    2. The bacteria is not present ($75\%$), and you fail to detect it with a probability of $100\%$, given that the bacteria is not there.
    Combining those two gives a probability of the bacteria being there as: $$ 0.25 * 0.125 + 0.75 * 1 = 0.78125 = 78.125\% $$

For me the problem is that the false positive rate is not given (which is necessary). Without it, I am assuming the probability of a false positive is $0\%$.

Correct Analysis (from comments)

\begin{align}
H_1 &= \text{no bacteria is present} \\
H_2 &= \text{bacteria is present} \\
D &= \text{three searches find no bacteria} \\
p(H_1 | D) &= \frac{p(D|H_1)*p(H_1)}{p(D|H_1)*p(H_1) + p(D|H_2)*p(H_2)}
\end{align}

We know $p(D|H_1) = 1$ (you said there were no false positives), $p(H_1) = 0.75$ ($75\%$ of rain forests do not contain the bacteria), $p(D|H_2) = 0.125$ (the probability of three false negatives given that the bacteria is present), and $p(H_2) = 0.25$ ($25\%$ of rain forests contain the bacteria).

This gives:

$$ p(H_1 | D) = \frac{0.75}{0.75 + 0.125 * 0.25} = 0.96 = 96\% $$

I misspoke in my original answer: I implied that the probability of there not being bacteria was fairly low (which is clearly counterintuitive), but I had actually calculated the probability that there was bacteria. If we change the formula above to calculate the probability that the bacteria is present, $p(H_2 | D)$, we get:

$$ p(H_2 | D) = \frac{0.125 * 0.25}{0.125 * 0.25 + 1*0.75} = 0.04 = 4\% $$
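
As a numerical check of the two results above, here is a minimal Python sketch of the same Bayes' rule calculation. The function name `posterior` and its parameter names are illustrative, not part of the question, and the default `p_false_positive=0.0` encodes the assumption of no false positives made in this answer.

```python
def posterior(prior_present, p_detect, n_searches, p_false_positive=0.0):
    """Posterior probability that the bacteria is present, given that
    n_searches independent searches all failed to detect it."""
    # p(D | H_2): every search misses although the bacteria is present.
    like_present = (1.0 - p_detect) ** n_searches
    # p(D | H_1): no search raises a false alarm when it is absent.
    # With the assumed false positive rate of 0 this is exactly 1.
    like_absent = (1.0 - p_false_positive) ** n_searches
    numerator = like_present * prior_present
    evidence = numerator + like_absent * (1.0 - prior_present)
    return numerator / evidence


p_present = posterior(prior_present=0.25, p_detect=0.5, n_searches=3)
print(round(p_present, 4))      # 0.04, i.e. p(H_2 | D)
print(round(1 - p_present, 4))  # 0.96, i.e. p(H_1 | D)
```

Passing a nonzero `p_false_positive` shows how the result would change if the searches could report bacteria that are not there, which the question does not specify.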

Explain why my original analysis is incorrect

Obviously what I did originally was wrong. There are things wrong with my original analysis...I will leave it up to you and commenters to explain them. Honestly, I think it will be helpful if I leave the incorrect analysis up.

1
On

Your answer to question 1 looks correct.

One possible approach to question 2 might be to consider $1$ million rainforests.

  • How many do you expect to contain bacterium Alpha?
  • How many do you expect not to contain bacterium Alpha?
  • How many do you expect to contain bacterium Alpha but not to have bacterium Alpha detected in three searches?
  • How many do you expect not to contain bacterium Alpha and not to have bacterium Alpha detected in three searches?
  • How many do you expect in total not to have bacterium Alpha detected in three searches?
  • What proportion do you expect of those which do not have bacterium Alpha detected in three searches actually contain bacterium Alpha?
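
If you want to check your counts, the questions above can be worked through in a short Python sketch. The variable names are illustrative, and the sketch assumes that a search never detects the bacterium where it is absent (no false positives), which the question does not state explicitly.

```python
total = 1_000_000          # rainforests considered
prior_present = 0.25       # fraction containing bacterium Alpha
p_miss_three = 0.5 ** 3    # all three searches miss it when it is present

# Rainforests that do and do not contain bacterium Alpha.
with_alpha = int(total * prior_present)
without_alpha = total - with_alpha

# Rainforests with Alpha where three searches all fail to detect it.
with_alpha_undetected = int(with_alpha * p_miss_three)
# Assumption: no false positives, so every Alpha-free rainforest
# also has no detection in three searches.
without_alpha_undetected = without_alpha

undetected_total = with_alpha_undetected + without_alpha_undetected
proportion = with_alpha_undetected / undetected_total

print(with_alpha, without_alpha)                         # 250000 750000
print(with_alpha_undetected, without_alpha_undetected)   # 31250 750000
print(undetected_total)                                  # 781250
print(proportion)                                        # 0.04
```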
0
On

Let $X$ represent the number of times the bacteria is detected in $n$ searches, and $P$ represent the event that the bacteria is present. Then the variable $X \mid P$ is binomial with parameters $n = 3$ and $p = 0.5$; and $X \mid \bar P$ is a degenerate distribution with $\Pr[X = 0 \mid \bar P] = 1$; that is to say, if no bacteria are present, we assume that the test will never detect its presence (no false positives). Then question 1 asks for $$\Pr[X = 0 \mid P]$$ which is simply a binomial probability.

For question 2, we want $$\Pr[P \mid X = 0] = \frac{\Pr[X = 0 \mid P]\Pr[P]}{\Pr[X = 0]}.$$ The numerator is easy to calculate, but the denominator is a bit trickier. The idea is to use the law of total probability and condition on whether the bacteria is present; i.e., $$\Pr[X = 0] = \Pr[X = 0 \mid P]\Pr[P] + \Pr[X = 0 \mid \bar P]\Pr[\bar P].$$
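
The two steps above can be sketched in Python with exact fractions. The helper `binom_pmf` and the variable names are illustrative choices, and the value $\Pr[P] = 1/4$ is taken from the question.

```python
from fractions import Fraction
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n = 3
p_detect = Fraction(1, 2)   # chance of detection per search, if present
prior = Fraction(1, 4)      # Pr[P]

# Question 1: Pr[X = 0 | P], a binomial probability.
p_x0_given_present = binom_pmf(0, n, p_detect)
# Degenerate case: Pr[X = 0 | not P] = 1 (no false positives assumed).
p_x0_given_absent = Fraction(1)

# Law of total probability for the denominator Pr[X = 0].
p_x0 = p_x0_given_present * prior + p_x0_given_absent * (1 - prior)

# Question 2: Bayes' rule for Pr[P | X = 0].
p_present_given_x0 = p_x0_given_present * prior / p_x0

print(p_x0_given_present)   # 1/8
print(p_x0)                 # 25/32
print(p_present_given_x0)   # 1/25
```

Working in `Fraction` avoids floating-point rounding, so the posterior comes out as exactly $1/25 = 4\%$.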