Boy Born on a Tuesday - is it just a language trick?


The following probability question appeared in an earlier thread:

I have two children. One is a boy born on a Tuesday. What is the probability I have two boys?

The claim was that it is not actually a mathematical problem and it is only a language problem.


If one wanted to restate this problem formally the obvious way would be like so:

Definition: Sex is defined as an element of the set $\\{\text{boy},\text{girl}\\}$.

Definition: Birthday is defined as an element of the set $\\{\text{Monday},\text{Tuesday},\text{Wednesday},\text{Thursday},\text{Friday},\text{Saturday},\text{Sunday}\\}$

Definition: A Child is defined to be an ordered pair: (sex $\times$ birthday).

Let $(x,y)$ be a pair of children.

Define an auxiliary predicate on a child $c = (s,b)$: $H(c) :\\!\\!\iff s = \text{boy} \text{ and } b = \text{Tuesday}$.

Calculate $P(x \text{ is a boy and } y \text{ is a boy}|H(x) \text{ or } H(y))$.

I don't see any other sensible way to formalize this question.


Actually solving this problem now requires no thought (in fact, it is thinking that leads us to guess incorrect answers); we just compute

$$ \begin{align*} & P(x \text{ is a boy and } y \text{ is a boy}|H(x) \text{ or } H(y)) \\\\ =& \frac{P(x\text{ is a boy and }y\text{ is a boy and }(H(x)\text{ or }H(y)))} {P(H(x)\text{ or }H(y))} \\\\ =& \frac{P((x\text{ is a boy and }y\text{ is a boy and }H(x))\text{ or }(x\text{ is a boy and }y\text{ is a boy and }H(y)))} {P(H(x)) + P(H(y)) - P(H(x))P(H(y))} \\\\ =& \frac{\begin{align*} &P(x\text{ is a boy and }y\text{ is a boy and }x\text{ born on Tuesday}) \\\\ + &P(x\text{ is a boy and }y\text{ is a boy and }y\text{ born on Tuesday}) \\\\ - &P(x\text{ is a boy and }y\text{ is a boy and }x\text{ born on Tuesday and }y\text{ born on Tuesday}) \\\\ \end{align*}} {P(H(x)) + P(H(y)) - P(H(x))P(H(y))} \\\\ =& \frac{1/2 \cdot 1/2 \cdot 1/7 + 1/2 \cdot 1/2 \cdot 1/7 - 1/2 \cdot 1/2 \cdot 1/7 \cdot 1/7} {1/2 \cdot 1/7 + 1/2 \cdot 1/7 - 1/2 \cdot 1/7 \cdot 1/2 \cdot 1/7} \\\\ =& 13/27 \end{align*} $$
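As a sanity check on the computation above, here is a minimal sketch that enumerates all 196 ordered pairs of children and counts. It assumes, as the derivation does, that sex and birthday are independent and uniform; the names `CHILDREN` and `prob_two_boys_given_tuesday_boy` are made up for this illustration.

```python
from fractions import Fraction
from itertools import product

SEXES = ("boy", "girl")
DAYS = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")

# A child is a (sex, birthday) pair; all 14 kinds are assumed equally likely.
CHILDREN = list(product(SEXES, DAYS))


def H(child):
    """The question's predicate: the child is a boy born on a Tuesday."""
    return child == ("boy", "Tuesday")


def prob_two_boys_given_tuesday_boy():
    # All 14 * 14 = 196 ordered pairs (x, y), each with probability 1/196.
    families = list(product(CHILDREN, repeat=2))
    conditioned = [(x, y) for x, y in families if H(x) or H(y)]
    both_boys = [(x, y) for x, y in conditioned if x[0] == "boy" and y[0] == "boy"]
    return Fraction(len(both_boys), len(conditioned))


print(prob_two_boys_given_tuesday_boy())  # 13/27
```

The count is 13 two-boy pairs out of 27 pairs satisfying $H(x) \text{ or } H(y)$, which matches the formula.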


Now what I am wondering is: does this refute the claim that this puzzle is just a language problem, or add to it? Was there a lot of room for misinterpreting the question which I just missed?

10

There are 10 best solutions below

28
On BEST ANSWER

There are even trickier aspects to this question. For example, what is the strategy of the guy telling you about his family? If he always mentions a son rather than a daughter, we get one probability; if he talks about the sex of the firstborn child, we get a different probability. Your calculation makes a choice on this issue: you choose the version "if the father has a boy and a girl, he'll mention the boy".

What I'm getting at is this: the question is not well-defined mathematically. It has several possible interpretations, and as such the "problem" here is indeed one of language; or more correctly, the fact that a simple statement in English does not convey enough information to specify the precise model for the problem.

Let's look at a simplified version without days. The probability space for the make-up of the family is {BB, GB, BG, GG} (GB means "an older girl and a younger boy", etc). We want to know $P(BB|A)$, where A is determined by the way we interpret the statement about the boys. Now let's look at different possible interpretations.

1) If there is a boy in the family, the statement will mention him. In this case A={BB,BG,GB} and so the probability is $1/3$.

2) If there is a girl in the family, the statement will mention her. In this case, since the statement talked about a boy, there are NO girls in the family. So A={BB} and so the probability is 1.

3) The statement talks about the sex of the firstborn. In this case A={BB,BG} and so the probability is $1/2$.

The bottom line: the statement about the family looks "constant" to us, but it must be viewed as a function of the random state of the family. There are several different possible functions, and you must choose one of them, otherwise no probabilistic analysis of the situation will make sense.
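The three interpretations can be checked by listing the event A for each one. This is only a sketch, assuming the four family types are equally likely; the helper `prob_bb` and the variable names are invented for the illustration.

```python
from fractions import Fraction

# First letter is the older child, second the younger; all four equally likely.
FAMILIES = ["BB", "BG", "GB", "GG"]


def prob_bb(event):
    """P(BB | A), where A is the list of families compatible with the statement."""
    return Fraction(sum(1 for f in event if f == "BB"), len(event))


# 1) A boy is mentioned whenever there is one.
mentions_boy_if_any = [f for f in FAMILIES if "B" in f]
# 2) A girl is mentioned whenever there is one, so hearing "boy" rules out girls.
mentions_girl_if_any = [f for f in FAMILIES if "G" not in f]
# 3) The statement is about the firstborn.
mentions_firstborn = [f for f in FAMILIES if f[0] == "B"]

print(prob_bb(mentions_boy_if_any))   # 1/3
print(prob_bb(mentions_girl_if_any))  # 1
print(prob_bb(mentions_firstborn))    # 1/2
```

The same sentence is heard in all three cases; only the rule that maps a family to a statement changes.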

5
On

The Tuesday is a red herring. It's stated as a fact, thus the probability is 1. Also, it doesn't say "only one boy is born on a Tuesday". But indeed, this could be a language thing.

With 2 children you have the following possible combinations:
1. two girls
2. a boy and a girl
3. a girl and a boy
4. two boys

If at least 1 is a boy we only have to consider the last three combinations. That gives us one in three that both are boys.
The error which is often made is to consider 2. and 3. as a single combination.

edit
I find it completely counter-intuitive that the outcome is influenced by the day, and I simulated the problem for one million families with 2 kids. And lo and behold, the outcome is 12.99 in 27. I was wrong.
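The simulation itself was not posted, so the following is only a guess at what such an experiment could look like: draw random two-child families, keep those with at least one Tuesday-born boy, and report the fraction with two boys. The function name, the seed, and the encoding of Tuesday as day 1 are all assumptions.

```python
import random


def simulate(n_families=200_000, seed=0):
    """Estimate P(two boys | at least one boy born on a Tuesday) by sampling."""
    rng = random.Random(seed)
    qualifying = 0  # families with at least one Tuesday-born boy
    two_boys = 0    # qualifying families in which both children are boys
    for _ in range(n_families):
        # Each child: sex is a fair coin, weekday is uniform; day 1 stands for Tuesday.
        kids = [(rng.random() < 0.5, rng.randrange(7)) for _ in range(2)]
        if any(is_boy and day == 1 for is_boy, day in kids):
            qualifying += 1
            if all(is_boy for is_boy, _ in kids):
                two_boys += 1
    return two_boys / qualifying
```

Calling `simulate(1_000_000)` should return a value close to 13/27 ≈ 0.4815, with the last digits varying from seed to seed.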

2
On

There is always room for misinterpreting a question when one does not fully understand the language in which it is written. I think that the way mathematics and mathematicians use conditional probability is clear:

$$P(A|B)=P(A \cap B)/P(B).$$

So I believe that this is the interpretation that one should take, and thus arrive at your answer of 13/27, and not search for further nuances, which are not too difficult to find.

3
On

It is actually impossible to have a unique and unambiguous answer to the puzzle without explicitly articulating a probability model for how the information on gender and birthday is generated. The reason is that (1) for the problem to have a unique answer some random process is required, and (2) the answer is a function of which random model is used.

  1. The problem assumes that a unique probability can be deduced as the answer. This requires that the set of children described is chosen by a random process, otherwise the number of boys is a deterministic quantity and the probability would be 0 or 1 but with no ability to determine which is the case. More generally one can consider random processes that produce the complete set of information referenced in the problem: choose a parent, then choose what to reveal about the number, gender, and birth days of its children.

  2. The answer depends on which random process is used. If the Tuesday birth is disclosed only when there are two boys, the probability of two boys is 1. If the Tuesday birth is disclosed only when there is a sister, the probability of two boys is 0. The answer could be any number between 0 and 1 depending on what process is assumed to produce the data.
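To make point 2 concrete, here is a small sketch with a hypothetical disclosure rule: the parent makes the "Tuesday boy" statement with one probability when there are two boys and with another when there is a boy and a girl. The function and parameter names are invented, and the counts 13 and 14 come from the uniform 196-family model.

```python
from fractions import Fraction


def prob_two_boys(disclose_if_two_boys, disclose_if_boy_and_girl):
    """P(two boys | statement made) under a hypothetical disclosure rule.

    Of the 196 equally likely (sex, day) families, 13 have two boys with at
    least one born on a Tuesday, and 14 have a Tuesday-born boy plus a girl.
    Each argument is the probability that such a parent makes the statement.
    At least one of the two arguments must be nonzero.
    """
    bb = 13 * Fraction(disclose_if_two_boys)
    bg = 14 * Fraction(disclose_if_boy_and_girl)
    return bb / (bb + bg)


print(prob_two_boys(1, 1))  # 13/27: every qualifying parent speaks
print(prob_two_boys(1, 0))  # 1: disclosed only when there are two boys
print(prob_two_boys(0, 1))  # 0: disclosed only when there is a sister
```

Intermediate disclosure rates give intermediate answers, so the sentence alone does not pin the number down.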

There is also a linguistic question of how to interpret "one is a boy born on Tuesday". It could mean that exactly one child is a Tuesday-born boy, or that at least one is.

9
On

I guess the following two versions of framing the question yield two different probabilities:

  1. Dave has two children. Is at least one of them a boy who is born on Tuesday? Dave answers Yes.

  2. Dave has two children. I ask him to first choose and fix one child at random, and tell me if it is a boy who was born on Tuesday. Dave answers yes he is a boy born on Tuesday.

For the first, the probability (of both being boys) is 13/27, while for the second the probability is 1/2.

The question as asked is in line with the first, hence the answer should be 13/27.
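Both framings can be enumerated exactly. The sketch below assumes independent, uniform sex and weekday, and encodes Tuesday as day 1; `framing_1` and `framing_2` are names made up here.

```python
from fractions import Fraction
from itertools import product

CHILDREN = list(product(("boy", "girl"), range(7)))  # day 1 stands for Tuesday
TUESDAY_BOY = ("boy", 1)
FAMILIES = list(product(CHILDREN, repeat=2))  # 196 equally likely ordered pairs


def framing_1():
    """Dave answers yes to 'is at least one of them a Tuesday-born boy?'."""
    yes = [f for f in FAMILIES if TUESDAY_BOY in f]
    return Fraction(sum(1 for a, b in yes if a[0] == b[0] == "boy"), len(yes))


def framing_2():
    """Dave fixes one child at random and reports that it is a Tuesday-born boy."""
    yes = Fraction(0)
    yes_and_two_boys = Fraction(0)
    for family in FAMILIES:
        for chosen in (0, 1):  # each child is picked with probability 1/2
            if family[chosen] == TUESDAY_BOY:
                yes += Fraction(1, 2)
                if family[1 - chosen][0] == "boy":
                    yes_and_two_boys += Fraction(1, 2)
    return yes_and_two_boys / yes


print(framing_1())  # 13/27
print(framing_2())  # 1/2
```

In the second framing the report is about a specific child, so the other child is simply an independent draw.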

1
On

This, in my opinion, is why the intuitive approach fails:

One has a tendency to think that 7*P(b AND d1) = P(b AND d1) + P(b AND d2) + ... + P(b AND d7) = P((b AND d1) OR (b AND d2) OR ... OR (b AND d7)) = P(b AND (d1 OR d2 OR ... OR d7)) = P(b), where "b AND di" is read as "at least one child is a boy born on day di".

However, the flaw here is that, in reality, P(b AND d1) + P(b AND d2) + ... + P(b AND d7) is NOT equal to P((b AND d1) OR (b AND d2) OR ... OR (b AND d7)). With two children these events are not mutually exclusive (one boy can be born on d1 and his brother on d2), so the sum counts such families more than once. This means that mentioning independent (and one might think irrelevant) information alongside relevant information actually changes the resulting probabilities.

One interesting consequence: if I say something like "I have two children. One of them is a boy who was born at 10:24 PM on February 10th," the probability that I have two boys is now almost exactly the same as the probability that I have a girl and a boy. Adding a unique or almost unique piece of information makes what I want to know about the other child independent of the information I have on the first child. Taken to the extreme, if I said that my firstborn is a boy, you would learn nothing additional about the other child.
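To see how the answer drifts toward 1/2 as the extra detail gets rarer, here is a sketch that generalizes "Tuesday" to a trait taking `n` equally likely values, independent of sex. That uniformity is an assumption made only for the illustration, as is treating a minute of the year as one of 365 * 24 * 60 values.

```python
from fractions import Fraction


def prob_two_boys(n):
    """P(two boys | at least one boy with a given trait), trait uniform over n values."""
    p_trait_boy = Fraction(1, 2 * n)                   # one child is a boy with the trait
    at_least_one = 2 * p_trait_boy - p_trait_boy ** 2  # inclusion-exclusion over two children
    # Two boys (probability 1/4) and at least one of them has the trait.
    both_boys = Fraction(1, 4) * (Fraction(2, n) - Fraction(1, n * n))
    return both_boys / at_least_one


print(prob_two_boys(1))  # 1/3: no extra information beyond "a boy"
print(prob_two_boys(7))  # 13/27: the weekday case
print(float(prob_two_boys(365 * 24 * 60)))  # a specific minute of the year: just under 0.5
```

The expression simplifies to (2n - 1)/(4n - 1), which increases from 1/3 at n = 1 toward 1/2 as n grows.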

2
On

Well, given the unstated assumption that the writer is a mathematician and therefore not using regular English, then I agree with the 13/27 answer.

But in everyday English, from "there are two fleems, one is a glarp" we all infer that the other is not a glarp.

From "there are two fleems, one is a glarp, which is snibble" we would still infer that the other is not a glarp. Whereas from "there are two fleems, one is a glarp which is snibble" (absence of comma, or when spoken, difference in intonation) we would infer that the other is not a snibble glarp, but it could still be an unsnibble glarp.

0
On

The $13/27\approx 0.481$ answer is an interesting counter-intuitive result. Here I give an intuitive explanation (which I do not think has been given in the above answers):

Intuition of why it is between $0.333$ and $0.5$:

We can distinguish the two children by firstborn and secondborn. If I tell you the firstborn is a boy, it tells you nothing about the secondborn, so the chance that the secondborn is a boy is $0.5$.

If we know one was born on Tuesday, we can "roughly" distinguish the two children by "born on Tuesday" and "not born on Tuesday" (with a small probability that both were born on Tuesday). So the answer is "almost" $0.5$, since it is similar to distinguishing between firstborn and secondborn. This is the intuition why the probability changes from $0.333$ to $0.481$ based on the Tuesday info (but does not go all the way up to $0.5$).

Formal derivation of $13/27$

I did not understand the $H(x)$ notation given by the OP so I simply rewrite the argument in standard notation. Let $B_1, B_2$ be the events that the firstborn and secondborn are boys, respectively. Let $T_1, T_2$ be the events that the firstborn and secondborn were born on Tuesday, respectively. For events $A, B$ we define $AB=A \cap B$.

\begin{align} P[B_1B_2 | B_1 T_1 \cup B_2 T_2]&=\frac{P[B_1 B_2(B_1 T_1 \cup B_2T_2)]}{P[B_1T_1 \cup B_2 T_2]}\\ &=\frac{P[B_1B_2T_1 \cup B_1B_2T_2]}{P[B_1T_1 \cup B_2 T_2]}\\ &=\frac{P[B_1B_2T_1]+P[B_1B_2T_2]- P[B_1B_2T_1T_2]}{P[B_1T_1] + P[B_2T_2] - P[B_1B_2T_1T_2]}\\ &=\frac{(1/2)^2(1/7) + (1/2)^2(1/7) - (1/2)^2(1/7)^2}{(1/2)(1/7) + (1/2)(1/7)-(1/2)^2(1/7)^2} = \frac{13}{27} \end{align} where the last line assumes independence and uniformity over boy/girl and days of week.

Of course without the Tuesday info we get $$ P[B_1B_2|B_1 \cup B_2] = \frac{P[B_1B_2]}{P[B_1\cup B_2]} = \frac{1}{3}$$

0
On

Generically, one could say that there are two halves to any math problem: First, you establish the mathematical relationships between the objects in the problem statement. Then, using mathematical theorems, you solve for the missing element. The only "language trick" here occurs when people decide that a problem statement, which is ambiguous, has an incontrovertible meaning. For example, in geometry, say we are told that two sides of a triangle have lengths 2 and 3, and that the angle opposite the side of length 2 is 30 degrees. There are actually two triangles that satisfy these conditions, because the angle opposite the side of length 3 can be either acute or obtuse. But a beginning student might consider only the acute one.

The issue with probability problems like this one is that such ambiguous details can very easily exist in a hidden manner. And even an experienced mathematician can get confused by taking the stated details too literally. For example, in the famous Monty Hall Problem, if one looks at only (1) the car was equally likely to be behind any of the three doors and (2) we know it wasn't door #3, then the incorrect solution is that it is still equally likely to be behind #1 or #2. The correct solution uses the fact that Monty Hall had to open door #3 if the car is behind door #2 (the contestant's first choice was #1). We aren't told, but have to assume, that he would pick randomly between #2 and #3 if the car is behind #1. This eliminates half of the cases where it is behind #1, but none of the cases where it is behind #2, making it now twice as likely that the car is behind #2.
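The role of the unstated assumption can be made explicit with a few lines of Bayes' rule. This is a sketch, not a full model of the game: the contestant has chosen #1, the host has opened #3, and the parameter `p_open_3_when_car_behind_1` (a name invented here) is the hidden part of the host's strategy.

```python
from fractions import Fraction


def prob_car_behind_2(p_open_3_when_car_behind_1=Fraction(1, 2)):
    """P(car behind #2 | contestant chose #1 and host opened #3)."""
    prior = Fraction(1, 3)
    # Probability that the host opens door #3, for each car location.
    # Car behind #2 forces door #3; car behind #3 forbids it.
    open_3 = {1: Fraction(p_open_3_when_car_behind_1), 2: Fraction(1), 3: Fraction(0)}
    total = sum(prior * open_3[door] for door in (1, 2, 3))
    return prior * open_3[2] / total


print(prob_car_behind_2())   # 2/3: host picks at random when he has a choice
print(prob_car_behind_2(1))  # 1/2: host always opens #3 when he can
```

With the random-choice assumption the car is twice as likely to be behind #2 as behind #1; with a different host strategy the same observation supports a different answer.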

I mention the Monty Hall Problem because it belongs to the same family of problems as this one, and has, essentially, the same issues and the same solution. There are two possible ways that Gary Foshee could have arrived at the statement in the problem:

  1. The producers of the G4G conference could have formulated the conditions "Two children, at least one a boy born on a Tuesday" and then sought a presenter who met the criteria. Since 27 out of every 196 parents of two children meet these requirements, and 13 of those have two boys, the answer is 13/27.
  2. The producers asked Gary Foshee to present a problem. Since he had two children, he decided to present a variation of a famous one popularized by the conference's namesake, Martin Gardner. He wanted to alter the statement to one in the form "I have two children, and at least one is a [gender] who was born on a [day]." There were probably two such statements that apply to his children, but there is at least one. There is one case where he has no choice except "Tuesday Boy" (and so he has two boys). But there are 26 where he could choose "Tuesday Boy" but could also choose something else. If he chooses randomly, only 13 remain. In 6 of those, he has two boys. The answer is (1+6)/(1+13)=1/2.

I must stress that both are possible. But the first requires an unreasonable interpretation of the ambiguous problem.
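The weighted count in the second scenario can be reproduced by enumeration. The sketch below assumes the parent picks uniformly among the distinct true statements about his children, and that all 196 families are equally likely; the function name and the encoding of Tuesday as day 1 are made up.

```python
from fractions import Fraction
from itertools import product

CHILDREN = list(product(("boy", "girl"), range(7)))  # day 1 stands for Tuesday
TUESDAY_BOY = ("boy", 1)
FAMILIES = list(product(CHILDREN, repeat=2))  # 196 equally likely ordered pairs


def tuesday_boy_statement_weights():
    """Return (weight of all families, weight of two-boy families) saying 'Tuesday Boy'."""
    said = Fraction(0)
    said_and_two_boys = Fraction(0)
    for family in FAMILIES:
        statements = set(family)  # one or two distinct true statements
        if TUESDAY_BOY in statements:
            weight = Fraction(1, len(statements))  # chance he picks "Tuesday Boy"
            said += weight
            if all(sex == "boy" for sex, _ in family):
                said_and_two_boys += weight
    return said, said_and_two_boys


said, said_and_two_boys = tuesday_boy_statement_weights()
print(said, said_and_two_boys)   # 14 7
print(said_and_two_boys / said)  # 1/2
```

The totals 14 = 1 + 13 and 7 = 1 + 6 are the same numbers that appear in the fraction (1+6)/(1+13).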

0
On

It's not a language trick, and here's a better example to illustrate the phenomenon. Let's take 4 types of families in equal numbers:

  1. 10 boys
  2. 9 girls, then 1 boy were born
  3. 1 boy, and then 9 girls were born
  4. 10 girls

When we say "family has at least 1 boy", we end up with the same probability to get the 10-boy family: $1/3$.

But when we add a condition "family has at least 1 boy born on Tue", this filters out most of 1-boy families. But many 10-boy families will satisfy this additional condition.

So any additional condition will favour the group which has more potential to have that condition. If we change the condition to "at least one boy and none of the boys were born on Tue", this would favour the families with just one boy.
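The filtering effect can be put in numbers. The sketch below assumes the four family types are equally common and that each boy is independently born on a Tuesday with probability 1/7; the function names are invented for the illustration.

```python
from fractions import Fraction

# Number of boys in each of the four equally common family types.
BOYS_PER_TYPE = {"10 boys": 10, "9 girls then 1 boy": 1, "1 boy then 9 girls": 1, "10 girls": 0}


def posterior_ten_boys(p_condition_given_boys):
    """P(10-boy family | condition), with a uniform prior over the four types.

    p_condition_given_boys(k) is the chance that a family with k boys
    satisfies the condition.
    """
    likelihood = {name: p_condition_given_boys(k) for name, k in BOYS_PER_TYPE.items()}
    return likelihood["10 boys"] / sum(likelihood.values())


def at_least_one_boy(k):
    return Fraction(1 if k > 0 else 0)


def at_least_one_tuesday_boy(k):
    # Each boy independently misses Tuesday with probability 6/7 (an assumption).
    return 1 - Fraction(6, 7) ** k


def boys_but_none_on_tuesday(k):
    return Fraction(6, 7) ** k if k > 0 else Fraction(0)


print(posterior_ten_boys(at_least_one_boy))  # 1/3
print(float(posterior_ten_boys(at_least_one_tuesday_boy)))
print(float(posterior_ten_boys(boys_but_none_on_tuesday)))
```

Under these assumptions the Tuesday condition pushes the 10-boy family well above 1/3, and the "no Tuesday boys" condition pushes it well below.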