I am trying to solve the following problem from Richard McElreath's Statistical Rethinking book:
Suppose there are two species of panda bear. Both are equally common in the wild and live in the same places. They look exactly alike and eat the same food, and there is yet no genetic assay capable of telling them apart. They differ however in their family sizes. Species A gives birth to twins 10% of the time, otherwise birthing a single infant. Species B births twins 20% of the time, otherwise birthing singleton infants. Assume these numbers are known with certainty, from many years of field research. Now suppose you are managing a captive panda breeding program. You have a new female panda of unknown species, and she has just given birth to twins. What is the probability that her next birth will also be twins?
Let $Next$ and $Current$ be the events that the next birth is twins and that the current birth is twins, respectively. What I am trying to find is $P(Next|Current)$, which by Bayes' rule should just be:
$P(Next|Current) = \frac{P(Current|Next)*P(Next)}{P(Current)}$
This is tricky to compute, because we should take into account what the species is. The species can only be A or B, which are equally likely. So
$P(A|Current) = P(Current|A)\,P(A)/P(Current) = 0.1 \times 0.5/(0.2 \times 0.5 + 0.1 \times 0.5) = 1/3$,
whereas $P(B|Current) = 1 - P(A|Current) = 2/3$
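As a quick check of these two posterior values, here is a minimal Python sketch of the same Bayes update. The names `prior`, `p_twins`, and `posterior_after_twins` are my own, chosen for illustration; the numbers are the ones given in the problem.

```python
from fractions import Fraction

# Prior over species and per-species twin probability, as stated in the problem.
prior = {"A": Fraction(1, 2), "B": Fraction(1, 2)}
p_twins = {"A": Fraction(1, 10), "B": Fraction(2, 10)}

def posterior_after_twins(prior, p_twins):
    """Return P(species | twins) for each species via Bayes' rule."""
    # P(Current): total probability of observing twins.
    evidence = sum(p_twins[s] * prior[s] for s in prior)
    return {s: p_twins[s] * prior[s] / evidence for s in prior}

posterior = posterior_after_twins(prior, p_twins)
print(posterior["A"], posterior["B"])  # 1/3 2/3
```

Exact fractions are used instead of floats so the output can be compared directly with the hand calculation.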
Great. Now how do I compute the left-hand side of the original expression (i.e., $P(Next|Current)$)? When I plug in values for $P(Current)$ and $P(Next)$, should I use the updated values of $P(A)$ and $P(B)$, or the old ones?
As a practical matter, the question has huge logical holes.
First, if it's so difficult to distinguish the two species, how did the field biologists determine which species was having twins more often?
Second, if the field biologists know so much about the pandas, why haven't we asked them for the relative sizes of the populations of species $A$ and species $B$ in order to get a better prior $P(A)$ than simply $\frac12$? Or did we ask and the answer was that the populations are equal in size?
Third, how will the panda become pregnant in order for there to be a "next birth"? Presumably you're going to let her mate with a male in your program. Do you know the species of the male? If not, why not? Does it matter? If so, how? If not, why not?
Setting aside those practical objections, if we assume the prior probabilities of the new panda's species are equal, and we assume the process by which the panda will become pregnant the next time is as mysterious as the process by which it happened the last time, then $P(A) = P(B),$ $P(Next) = P(Current),$ and $P(Current \mid Next) = P(Next \mid Current).$ That is, the equation
$$P(Next\mid Current) = \frac{P(Current\mid Next)P(Next)}{P(Current)},$$
while true, does not help us at all.
Since the species of the panda ($A$ or $B$) is the only thing mentioned in the problem that has a known effect on the probability of the next birth being twins, I don't see how you can avoid asking what the panda's species is. You can, however, do the calculation a little differently than shown in the solution on the author's web page. Assuming that successive births are independent given the species, so that $P(Current\cap Next \mid A) = P(Current\mid A)\,P(Next\mid A)$ and likewise for $B$:
\begin{align}
P(Next\mid Current) &= \frac{P(Current\cap Next)}{P(Current)} \\
&= \frac{P(Current\cap Next\cap A) + P(Current\cap Next\cap B)} {P(Current\cap A) + P(Current\cap B)} \\
&= \frac{P(Current\cap Next \mid A)P(A) + P(Current\cap Next \mid B)P(B)} {P(Current\mid A)P(A) + P(Current\mid B)P(B)} \\
&= \frac{(0.1\times 0.1)\times\frac12 + (0.2\times 0.2)\times\frac12} {0.1\times\frac12 + 0.2\times\frac12} \\
&= \frac{0.025}{0.15} \\
&= \frac16.
\end{align}
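To check the arithmetic, here is a minimal Python sketch that computes the answer two ways: as the ratio of joint probabilities used above, and by first updating the species probabilities on the observed twins and then averaging the twin rates with those updated weights. Both routes assume births are independent given the species. The variable names are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

# Prior over species and per-species twin probability, as stated in the problem.
prior = {"A": Fraction(1, 2), "B": Fraction(1, 2)}
p_twins = {"A": Fraction(1, 10), "B": Fraction(1, 5)}

# Route 1: P(Current and Next) / P(Current), summing over species.
# Assumes births are independent given the species, so the joint
# probability of two twin births for species s is p_twins[s] ** 2.
p_current = sum(p_twins[s] * prior[s] for s in prior)
p_both = sum(p_twins[s] ** 2 * prior[s] for s in prior)
route_joint = p_both / p_current

# Route 2: update the species probabilities on the first twin birth,
# then weight each species' twin rate by its updated probability.
posterior = {s: p_twins[s] * prior[s] / p_current for s in prior}
route_posterior = sum(p_twins[s] * posterior[s] for s in prior)

print(route_joint, route_posterior)  # 1/6 1/6
```

The second route uses the updated values $P(A\mid Current) = \frac13$ and $P(B\mid Current) = \frac23$ rather than the original $\frac12$ each, and it agrees with the ratio of joint probabilities.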