What Is the Probability The Second Kid Is a Boy?


Okay, so I was asked this question in an interview for a machine learning expert position. To be honest, the question itself (and the interviewer's hint) seemed quite ill-phrased, which is probably why I ended up failing the interview, and he must have thought I was super dumb. Here is the original question.

You know your colleague has two kids, and also know one of them is a boy. What is the probability that the other one is a boy too?

I was a bit puzzled, then he gave me a hint by asking me to use Bayes' theorem, which I remembered from high school as $$\mathbb{P}(A\cap B)=\mathbb{P}(A|B)\,\mathbb{P}(B)$$ I could see that "one kid is a boy" corresponds to event $B$, but could not really figure out the other quantities.

To confuse matters, he gave me hints like "when you see people with two kids, most of the time it is a boy and a girl, right?" I could not argue with him, obviously, but I cannot reach any such conclusion from my personal observation either.

I tried to say things like

  • to calculate it, we need empirical data, such as a survey of all couples with two kids in the city/country etc.
  • absent other information, the second child has the same probability of being a boy as the proportion of males in the country, assuming each kid's gender is independent

But it seems he had some assumption about the scenario (one that meant the problem can be solved purely mathematically) that I failed to clarify. Upon further thought, there may be some biological concepts on how chromosomes interact to decide the gender of the second kid (and whether it is biased one way or another), but that is hardly fair to expect from an ML engineer. Is that where the answer lies?

The reason for this post is not to complain; I am giving the context only to ask what exactly I am missing in the question, assuming it is meant to be a probability (and not biology) question.


There are 2 best solutions below

1
On

I find nothing ambiguous about the question. You are told that:

  • Your colleague has two children
  • One of the children is a boy

You also can reasonably assume that

  • Boys and girls are equally likely and comprise the only options for the sex of children
  • The probability of a given child being a boy does not affect the probability of any other child being a boy

The answer follows readily from the above: we enumerate all possible equiprobable outcomes of the sex of two children:

$$(B,B), (B,G), (G,B), (G,G)$$

where the two children are identifiable. You are given that one child is a boy; thus your colleague's children would be one of the three equiprobable outcomes $$(B,B), (B,G), (G,B).$$

Of these, only one case has the other child be a boy, thus the desired probability is $1/3$.
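The enumeration above can be checked mechanically. The following is a minimal sketch in Python, assuming (as stated above) that boys and girls are equally likely and that the two children are independent; the function name is only illustrative.

```python
from fractions import Fraction
from itertools import product


def prob_both_boys_given_at_least_one_boy():
    # All ordered (first child, second child) outcomes, assumed equally likely.
    outcomes = list(product("BG", repeat=2))
    # Condition on the given information: at least one child is a boy.
    consistent = [o for o in outcomes if "B" in o]
    # Favourable outcomes: the other child is a boy too, i.e. (B, B).
    favourable = [o for o in consistent if o == ("B", "B")]
    return Fraction(len(favourable), len(consistent))


print(prob_both_boys_given_at_least_one_boy())  # 1/3
```

Using `Fraction` keeps the result exact instead of a rounded float.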


I do not think the question is ambiguous in the matter of whether it should be specified that at least one child is a boy. If I say I have two fair coins, say a quarter and a nickel, and I flip them, observe the result, and then tell you that one of them is heads, that should not automatically imply to you that the other must be tails. If I had said "I obtained one head," that is more ambiguous, because it implies that the total number of heads obtained is one. But saying "one is heads" is not implying anything about the other coin.


It is perhaps counterintuitive that, had the colleague instead said to you "my eldest child is a boy," the probability that the other child is also a boy would be $1/2$, not $1/3$, since in such a case the only permissible outcomes are $$(B,B), (B,G),$$ if we take the first element of the ordered pair to be the sex of the older child and the second element to be the sex of the younger child. Note that $(G,B)$ is no longer permitted because we were told that the eldest child is a boy. This is a well-known paradox, and people disagree about the interpretation of the original question as stated.
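The difference between the two conditions can also be seen by simulation. This is a rough sketch, assuming equally likely, independent sexes; the function name, the number of simulated families, and the seed are arbitrary choices made for illustration.

```python
import random


def simulate(n_families=200_000, seed=0):
    rng = random.Random(seed)
    at_least_one = both_given_at_least_one = 0
    elder_boy = both_given_elder = 0
    for _ in range(n_families):
        elder = rng.choice("BG")
        younger = rng.choice("BG")
        both = elder == "B" and younger == "B"
        # Condition 1: at least one of the two children is a boy.
        if elder == "B" or younger == "B":
            at_least_one += 1
            both_given_at_least_one += both
        # Condition 2: the elder child specifically is a boy.
        if elder == "B":
            elder_boy += 1
            both_given_elder += both
    return both_given_at_least_one / at_least_one, both_given_elder / elder_boy


p_any, p_elder = simulate()
print(f"P(both boys | at least one boy) ~ {p_any:.3f}")
print(f"P(both boys | elder is a boy)   ~ {p_elder:.3f}")
```

The first estimate should land near $1/3$ and the second near $1/2$, up to sampling noise.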

0
On

Given that your colleague has two kids, and implicitly assuming binary biological sex (each kid is either a boy (B) or a girl (G)), the sample space is

$$\Omega=\{(G,G), (G,B), (B,G), (B,B) \}$$

with the following probability measure:

$$\mathbb P ((G,G))=p^2, \quad \mathbb P ((G,B))=\mathbb P ((B,G))=p(1-p), \quad \mathbb P ((B,B))=(1-p)^2$$

where $\color{blue}{p}$ denotes the probability that a kid is a girl (implicitly assuming the sexes of the two kids are independent of each other).

After being informed that one of the two kids is a boy, i.e., that the event $B_1=\{(G,B), (B,G), (B,B) \}$ has occurred, the conditional probability that the other is also a boy, i.e., of the event $B_2=\{(B,B) \}$, is given by

$$\mathbb P (B_2 | B_1)=\color{blue}{\frac{(1-p)^2}{1-p^2}}.$$
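As a short worked step (not essential to the argument), the expression simplifies because $1-p^2=(1-p)(1+p)$ and $p<1$:

$$\frac{(1-p)^2}{1-p^2}=\frac{(1-p)^2}{(1-p)(1+p)}=\frac{1-p}{1+p}, \qquad \text{so } p=\tfrac{1}{2} \text{ gives } \frac{1/2}{3/2}=\frac{1}{3}.$$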

The parameter $\color{blue}{p}$ needs to be estimated from some data set. As you are currently located in Turkey, we have the following from this official Turkish website:

According to birth statistics; the number of babies born alive in 2020 was 1 million 112 thousand 859. 570 thousand 892 of them were boys, and 541 thousand 967 of them were girls. 97.1% of the babies born alive were single births, 2.9% were twins, and 0.1% were triplets or more.

Hence, as the sample size is 1 million 112 thousand 859, a highly accurate estimate of $\color{blue}{p}$ based on 2020 data is $541967/1112859\approx\color{blue}{0.4870041937}$ (note that it is very close to 48.7%, the share of girls in Turkey's child population). Hence, the probability is

$$\mathbb P (B_2 | B_1)=0.3449861194.$$
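The arithmetic can be reproduced with a few lines of Python. This is only a sketch: the function and variable names are made up for illustration, the counts are the 2020 figures quoted above, and independence between the two children is assumed as before.

```python
def both_boys_given_at_least_one(p_girl):
    """P(both boys | at least one boy) when each child is a girl with
    probability p_girl, independently of the other child."""
    return (1 - p_girl) ** 2 / (1 - p_girl ** 2)


# 2020 live-birth counts from the quoted statistics.
girls = 541_967
total = 1_112_859

p_girl = girls / total  # empirical estimate of p
print(round(p_girl, 10))
print(round(both_boys_given_at_least_one(p_girl), 10))
print(both_boys_given_at_least_one(0.5))  # the symmetric case p = 1/2
```

The first two printed values should agree with the estimate of $p$ and the conditional probability stated above.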

You can see that although it is fairly close to $\frac{1}{3}=0.\bar{3}$, obtained by assuming $p=\frac{1}{2}$, it is more than 0.01 greater than $\frac{1}{3}$.