Case
Scientists find that bacteria alpha occurs in $25\%$ of rainforests. If bacteria alpha is in fact present, there is a $50\%$ chance of detecting it in a single search. Three searches fail to detect bacteria alpha.
Questions
1) What is the probability that all three searches will fail to detect the bacteria, if in fact it is present?
Bayes' rule is given as $$\Pr(H_1 \mid D)=\frac{\Pr(D \mid H_1)\,\Pr(H_1)}{\sum_j \Pr(D \mid H_j)\,\Pr(H_j)}$$
2) If the bacteria is recorded as absent in each of three searches, use Bayes' rule to calculate the posterior probability that it is in fact present at the site.
My attempt
For question 1, since there is a $50\%$ chance of detecting the bacteria, there is also a $50\%$ chance of not detecting it in a single search when it is present. Therefore, treating the three searches as independent,
- $\Pr(\text{no detection in three searches} \mid \text{present}) = (1-0.5)^3 = 0.125$
Is this correct?
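A quick numerical check of that arithmetic, as a minimal sketch. It assumes the three searches are independent, which the case does not state explicitly, and the variable names are mine:

```python
# Probability that every search misses the bacteria when it is present.
# Assumption: the searches are independent of one another.
p_detect = 0.5      # chance of detection in one search, given presence
n_searches = 3      # number of searches carried out

p_all_fail = (1 - p_detect) ** n_searches
print(p_all_fail)   # 0.125
```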
For question 2, I have no idea, can someone please show the working out for this question?
Thank you for taking the time to read this.
Original, incorrect, analysis
I'm not completely certain on what posterior probability means, but let's reason through this:
- The search site definitely has the bacteria
It is given that the probability of detection is $50\%$ when the bacteria is present (this is one of the parameters of the case). Therefore the probability that all three searches fail is indeed $0.5^3 = 0.125 = 12.5\%$.
- You take a random sample (of some place which may or may not contain the bacteria) and you don't detect the bacteria there. There are two cases:
- The bacteria is present: $25\%$
You don't detect it with a probability of $0.125$
- The bacteria is not present: $75\%$
You don't detect the bacteria with a probability of $100\%$, assuming there are no false positives when the bacteria is not there.
Combining those two gives a probability of the bacteria being there as: $$ 0.25 * 0.125 + 0.75 * 1 = 0.78125 = 78.125\% $$
For me the problem is that the probability of a false positive is not given (and it is necessary). Without it, I am assuming the probability of a false positive is $0\%$.
Correct Analysis (from comments)
\begin{align}
H_1 &= \text{no bacteria is present} \\
H_2 &= \text{bacteria is present} \\
D &= \text{three searches find no bacteria} \\
p(H_1 | D) &= \frac{p(D|H_1)*p(H_1)}{p(D|H_1)*p(H_1) + p(D|H_2)*p(H_2)}
\end{align}
We know $p(D|H_1) = 1$ (you said there were no false positives), $p(H_1) = 0.75$ ($75\%$ of rain forests do not contain the bacteria), $p(D|H_2) = 0.125$ (the probability of three false negatives given that the bacteria is present), and $p(H_2) = 0.25$ ($25\%$ of rain forests contain the bacteria).
This gives:
$$ p(H_1 | D) = \frac{0.75}{0.75 + 0.125 * 0.25} = 0.96 = 96\% $$
I misspoke in my original answer (I implied that the probability of there not being bacteria was fairly low, which is clearly counterintuitive, but what I was actually trying to calculate was the probability that the bacteria is present). If we change the above to calculate that probability instead, we get:
$$ p(H_2 | D) = \frac{0.125 * 0.25}{0.125 * 0.25 + 1*0.75} = 0.04 = 4\% $$
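The two posteriors can be reproduced with a few lines of Python. This is only a sketch under the assumptions already made here: the searches are independent and there are no false positives. The function name `posterior_present` and its parameters are mine, not part of the question.

```python
def posterior_present(prior_present, p_detect, n_failed, p_false_positive=0.0):
    """Posterior probability that the bacteria is present after n_failed
    searches that all found nothing.

    Assumptions (not given in the case): searches are independent, and
    p_false_positive defaults to 0, i.e. a search never reports the
    bacteria when it is absent.
    """
    # Likelihood of the data under each hypothesis
    like_present = (1 - p_detect) ** n_failed         # p(D | H_2)
    like_absent = (1 - p_false_positive) ** n_failed  # p(D | H_1)

    # Total probability of the data (denominator of Bayes' rule)
    evidence = like_present * prior_present + like_absent * (1 - prior_present)

    return like_present * prior_present / evidence


p_present = posterior_present(prior_present=0.25, p_detect=0.5, n_failed=3)
p_absent = 1 - p_present
print(round(p_present, 4), round(p_absent, 4))  # 0.04 0.96
```

Keeping the false-positive rate as a parameter makes the assumption explicit: if a nonzero rate were known, it could be passed in and the posterior would change accordingly.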
Explain why my original analysis is incorrect
Obviously what I did originally was wrong. There are things wrong with my original analysis...I will leave it up to you and commenters to explain them. Honestly, I think it will be helpful if I leave the incorrect analysis up.
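As one possible starting point for that explanation, here is a small sketch (variable names are mine, and it again assumes independent searches and no false positives). It suggests that the sum in the original analysis is the total probability of three failed searches, $p(D)$, which is only the denominator of Bayes' rule and not a posterior.

```python
prior_present = 0.25       # p(H_2)
like_present = 0.5 ** 3    # p(D | H_2): three misses in a row
like_absent = 1.0          # p(D | H_1): assumes no false positives

# The sum from the original analysis: the total probability of the data, p(D).
evidence = prior_present * like_present + (1 - prior_present) * like_absent

# A posterior needs one term of that sum divided by the whole sum.
post_present = prior_present * like_present / evidence

print(evidence)                 # 0.78125
print(round(post_present, 4))   # 0.04
```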