Probability of getting genes


Is the answer to this question wrong?

Consider the three alleles A, B, and O for the human blood types. Suppose that, in a certain population, the frequencies of these alleles are 0.45, 0.20, and 0.35, respectively. Kim and John, two members of the population, are married and have a son named Dan. Kim and Dan both have blood types AB. John’s blood type is B. What is the probability that John’s genotype is BB?

Answer: Clearly, John’s genotype is either BB or BO. Let E be the event that it is BB. Then $E^c$ is the event that John’s genotype is BO. Let F be the event that Dan’s genotype is AB. By Bayes’ formula, the probability we are interested in is

$$\begin{align}P(E|F)&=\frac{P(F|E)P(E)}{P(F|E)P(E)+P(F|E^c)P(E^c)}\\ &=\frac{(1/2)(0.2)^2}{(1/2)(0.2)^2+(1/4)(0.20)(0.35)}\\&=0.53\end{align}$$

Note that $P(E) = (0.20)^2$ and $P(E^c) = (0.20)(0.35)$ since for the two B alleles in John’s blood type, each is coming from one parent independently of the other parent. Remember John is the father.

My question: $E$ and $E^c$ are not even complements of each other!!

There are 2 best solutions below

2
On BEST ANSWER

You are correct that events $E$ and $E^C$, as written, are not complements. However, given that we know John has type B blood, they are complements. So how do we reconcile that with the fact that their probabilities don't sum to $1$?

Let $B$ be the event that John has type $B$ blood. Then $P(E \cup E^C \mid B)=1$, in other words, $P(E\mid B) +P(E^C\mid B)=1$. We can update our Bayes calculation by conditioning everything on the event $B$ (which we assume has happened):

\begin{align*} P(E \mid F ~\&~ B) &= \frac{P(F \mid E ~\&~ B)P(E \mid B)}{P(F \mid E ~\&~ B)P(E \mid B)+P(F \mid E^C ~\&~ B)P(E^C \mid B)}\\ &=\frac{P(F \mid E)P(E ~\&~ B)/P(B)}{P(F \mid E)P(E ~\&~ B)/P(B)+P(F \mid E^C ~\&~ B)P(E^C ~\&~ B)/P(B)}\\ &=\frac{P(F \mid E)P(E ) }{P(F \mid E)P(E )+P(F \mid E^C ~\&~ B )P(E^C~\&~ B)}\\ \end{align*}

which is the expression calculated in the answer. (However, as noted, their calculation of $P(E^C ~\&~ B)$ should have an additional factor of $2$ to account for the fact that it doesn't matter which parent supplies the O allele and which supplies the B allele.)

The steps that went into the simplification were:

  • The definition of $P(E\mid B) = P(E ~\&~ B)/P(B)$ (and similarly for $E^C$)
  • $E \subseteq B$, so $E ~\&~ B =E$ (this is not the case for $E^C$)
  • Cancellation of the $1/P(B)$ terms

If you're uncomfortable with the "conditional Bayes formula" that I wrote, then just remember that $P(\cdot \mid B)$ is a probability measure, so we could call it $\tilde{P}(\cdot)$. I.e., when I write $\tilde{P}(A)$ I mean $P(A \mid B)$. Then the conditional Bayes formula is: $$\tilde{P}(E\mid F) = \frac{\tilde{P}(F \mid E)\tilde{P}(E)}{\tilde{P}(F \mid E)\tilde{P}(E)+\tilde{P}(F \mid E^C)\tilde{P}(E^C )}.$$
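As a rough check of what the factor of $2$ does to the number, here is the final expression evaluated with $P(E^C ~\&~ B) = 2(0.20)(0.35) = 0.14$. This is only a sketch: it assumes John's two alleles are independent draws from the stated frequencies, and it takes $P(F \mid E) = 1/2$ and $P(F \mid E^C ~\&~ B) = 1/4$ from the quoted answer.

$$P(E \mid F ~\&~ B) = \frac{(1/2)(0.20)^2}{(1/2)(0.20)^2+(1/4)(2)(0.20)(0.35)} = \frac{0.02}{0.02+0.035} = \frac{4}{11} \approx 0.36$$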

0
On

My answer appears to differ from the others. Perhaps that's because I am making different assumptions.

As I said in the comments, it appears to me that one needs to know (or to calculate) the frequencies for the various allele pairs.

If we assume independence of the two alleles (Note: I'm not at all sure that is a good assumption):

Then $P(BB)=.2^2=.04$ and $P(OB)=2\times .2\times .35=.14 $

Thus, our prior belief for John's genotype, given that he has blood type $B$, is $$P(BB\,|\,B)=\frac {.04}{.04+.14}=\frac 29$$

$$P(BO\,|\,B)=\frac {.14}{.04+.14}=\frac 79$$

Now we are told that their son has type $AB$. Of course, the probability of $AB, BB$ parents yielding an $AB$ child is $\frac 12$ and the probability of $AB,BO$ parents yielding an $AB$ child is $\frac 14$. So the posterior probability that John is $BB$ is $$\frac {\frac 12\times \frac 29}{\frac 12\times \frac 29+\frac 14\times \frac 79}=\frac 4{11}=.\overline {36}$$
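For anyone who wants to check the arithmetic, here is a minimal sketch in Python using exact fractions. It relies on the same independence assumption as above (which may not be a good one), and the helper names `genotype_prior` and `p_child_AB` are made up for this illustration.

```python
from fractions import Fraction

# Allele frequencies from the question.
freq = {"A": Fraction(45, 100), "B": Fraction(20, 100), "O": Fraction(35, 100)}

def genotype_prior(a, b):
    """Prior of the unordered genotype {a, b}, ASSUMING independent alleles."""
    p = freq[a] * freq[b]
    # A heterozygote can arise in two orders (a from one parent, b from the other).
    return p if a == b else 2 * p

def p_child_AB(father, mother="AB"):
    """Probability that a child of these parents has genotype AB.

    Each parent passes on one of their two alleles with probability 1/2,
    so we count the favourable cases among the 4 equally likely pairs.
    """
    hits = sum(1 for m in mother for f in father if {m, f} == {"A", "B"})
    return Fraction(hits, 4)

# John has blood type B, so his genotype is BB or BO.
priors = {"BB": genotype_prior("B", "B"), "BO": genotype_prior("B", "O")}

# Bayes: weight each candidate genotype by the likelihood of an AB son.
joint = {g: p_child_AB(g) * priors[g] for g in priors}
posterior_BB = joint["BB"] / sum(joint.values())

print(posterior_BB)  # 4/11
```

Normalising the priors by $P(B)=.04+.14$ first is not needed here, since that common factor cancels in the ratio.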

Qualitatively: My assumptions lead me to the belief that $BB$ is quite a bit rarer than $BO$, so while $AB$ for the son is certainly evidence that the father was not $BO$ it isn't all that compelling, given the rarity of $BB$. But, of course, my assumptions might be wrong (or I could have blundered in some other way).