Good evening, everyone. I'm currently reading Davy Paindaveine & Philippe Spindel (2023), "Revisiting the Name Variant of the Two-Children Problem," The American Statistician, 77(4), 401-405, DOI: 10.1080/00031305.2023.2173293. While working through some of the preliminary probabilities, I got stuck on something I can't resolve. The relevant passage reads:
> The second variant rather asks: for a two-children family having at least a girl whose name is Florida, what is the probability that the other child is a boy? See, for example, Mlodinow (2008) or Marks and Smith (2011). If two sisters may be given the same name, then this variant is strictly equivalent to the previous one: more precisely, if it is assumed that girls are independently named Florida with probability r, then the probability that the other child is a boy is 2/(4 − r). To make the second variant of interest, one therefore needs to assume that two sisters may not be given the same name, in which case, under the assumptions associated with what we will call Model A below, the probability that the other child is a boy is 1/2, irrespective of r.
I'm trying to work through the final claim in that passage (that the probability is 1/2 when two sisters may not be given the same name) as follows. The siblings are written as ordered pairs, where $B$ is a boy, $G_F$ is a girl named Florida, and $G_F^C$ is a girl not named Florida:
$$\frac{P\{(B,G_F)\cup(G_F,B)\}}{P\{(B,G_F)\cup(G_F,B)\cup(G_F^C,G_F)\cup(G_F,G_F^C)\}}=\frac{2\times.5^2r}{2\times .5^2r+P\{(G_F^C,G_F)\cup(G_F,G_F^C)\}}$$
Now, where I'm getting stuck is resolving $P\{(G_F^C,G_F)\cup(G_F,G_F^C)\}$. If I reason as follows, I can get the right answer:
- The probability neither sister is named Florida is $(1-r)^2$
- So the probability at least one sister is named Florida is $1-(1-r)^2$
- But I don't want to count the probability both are named Florida
- So I subtract $r^2$, leaving $1-(1-r)^2-r^2=2r$
- So the probability of two girls, one named Florida, is $.5^2\times 2r$.
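The algebra in the steps above can be double-checked with exact rational arithmetic. The following is a minimal sketch, assuming each sister is independently named Florida with probability $r$ (the assumption implicit in $(1-r)^2$); the value $r=1/10$ and the function name `sister_name_terms` are illustrative choices, not taken from the paper. Worth noting: expanded term by term, $1-(1-r)^2-r^2$ equals $2r-2r^2=2r(1-r)$ rather than $2r$, so the simplification in the last two steps deserves a second look.

```python
from fractions import Fraction


def sister_name_terms(r):
    """Return the quantities from the bullet list for two sisters.

    Assumption: each sister is independently named Florida with
    probability r, which is what (1 - r)**2 presupposes.
    """
    neither = (1 - r) ** 2           # neither sister is named Florida
    at_least_one = 1 - neither       # at least one sister is named Florida
    both = r ** 2                    # both sisters are named Florida
    exactly_one = at_least_one - both
    return neither, at_least_one, both, exactly_one


r = Fraction(1, 10)  # illustrative value only
neither, at_least_one, both, exactly_one = sister_name_terms(r)

print(exactly_one)       # 9/50
print(2 * r * (1 - r))   # 9/50, the fully expanded form
print(2 * r)             # 1/5, the value used in the last bullet
```

Using `Fraction` rather than floats keeps the comparison exact, so any mismatch is a genuine algebraic difference and not rounding.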
What's confusing me is that I also feel like I should be able to do it this way:
$$\begin{aligned}
P\{(G_F^C,G_F)\cup(G_F,G_F^C)\}&=P(G_F^C,G_F)+P(G_F,G_F^C)\\
&=P(G_F^C|G_F)P(G_F)+P(G_F|G_F^C)P(G_F^C)\\
&=.5^2[1\cdot r+r(1-r)]=.5^2[2r-r^2]
\end{aligned}$$
Where is this $-r^2$ term coming from? What's weird is that, reasoning this way, I was able to confirm that $P(G_F^C,G_F)+P(G_F,G_F^C)+P(G_F^C,G_F^C)=.5^2(1)$, so the three outcomes account for the full probability of two girls. Somehow the $(G_F,G_F)$ event has probability 0, as it's supposed to, but the probability of "exactly one sister named Florida" is off by exactly what $P(G_F,G_F)$ would have been had that event been possible.
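To see what each candidate value does to the final answer, both can be plugged into the ratio from the first displayed equation. This is a minimal sketch with the common factor $.5^2$ cancelled; the value $r=1/10$ and the function name `prob_other_is_boy` are illustrative assumptions. With $2r$ the ratio is $1/2$, while with $2r-r^2$ it coincides with $2/(4-r)$, the value the quoted passage gives for the case where two sisters may share a name.

```python
from fractions import Fraction


def prob_other_is_boy(r, sisters_term):
    """Ratio 2r / (2r + sisters_term), i.e. the displayed conditional
    probability after cancelling the common factor .5**2.

    sisters_term is the bracketed value for "two girls, exactly one
    named Florida", e.g. 2r or 2r - r**2.
    """
    boy_girl = 2 * r  # (B, G_F) and (G_F, B), each contributing r
    return boy_girl / (boy_girl + sisters_term)


r = Fraction(1, 10)  # illustrative value only

first = prob_other_is_boy(r, 2 * r)             # bracket from the first strategy
second = prob_other_is_boy(r, 2 * r - r ** 2)   # bracket from the second strategy

print(first)         # 1/2
print(second)        # 20/39
print(2 / (4 - r))   # 20/39, the shared-name value from the quoted passage
```

Trying other values of `r` in $(0,1)$ shows the same pattern: the first ratio stays at $1/2$, and the second tracks $2/(4-r)$.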
I feel like I'm missing something fundamental here. The only explanation I have is that, as soon as we defined an independent probability $r$ of a girl being named Florida, we lost the right to treat "two sisters named Florida" as an impossibility excluded from the overall sample space. That would necessitate the first strategy I used, while invalidating my $P(G_F|G_F^C)=.5r$ and $P(G_F^C|G_F)=.5(1)$ conditionals in the second strategy. "May not be given the same name" appears to mean "exclude the $(G_F,G_F)$ event from consideration", and not "$P(G_F,G_F)=0$". However, even if this is correct, I still can't explain the fact that the second strategy gave a complete probability distribution.
Can anyone shed some light on this?