Why does probability change as you change perspective?

I was trying to solve the following question:

Out of 2 boys and 2 girls, two students are chosen to advance to the next level. What is the probability that two girls advance to the next level?

However, because the question was ambiguous, I calculated the probability for all four combinations of distinguishable or indistinguishable students and ordered or unordered selection, and got different answers depending on the case.

However, I want to know why this happens. All you are doing is simply picking two students and seeing if they are both girls. You keep doing this and after infinite trials, divide the number of times they were both girls by the number of total trials. How does this outcome depend upon whether you view them as distinguishable, indistinguishable, ordered, or non-ordered?

Cases:

  1. Indistinguishable, order matters: $\frac{1}{2\cdot 2}=\frac{1}{4}$ (Cases are BB, BG, GB, GG)

  2. Indistinguishable, order does not matter: $\frac{1}{3}$ (Cases are BB, B+G, GG)

  3. Distinguishable, order matters: $\frac{2}{\text{Permutation}(4,2)} = \frac{1}{6}$

  4. Distinguishable, order does not matter: $\frac{1}{\binom{4}{2}}=\frac{1}{6}$

There are 2 best solutions below

On BEST ANSWER

The difference comes from the ambiguity of the instruction "choose two students." The common interpretation (though this is certainly up for debate, and a reason to be extremely precise when asking probability questions) is that you are sampling from a uniform distribution on the set $S$ of pairs of students, i.e. the two-element subsets of the set of students. You can then set up the calculation in many ways: pick the first student, then pick the second student, $\frac{1}{2}\cdot\frac{1}{3} = \frac{1}{6}$; sample uniformly from the $4\cdot 3$ ordered pairs, $\frac{2}{4\cdot 3} = \frac{1}{6}$; map that uniform distribution to a new distribution (which happens to still be uniform) on the quotient space where order doesn't matter, $\frac{1}{\binom{4}{2}}=\frac{1}{6}$; and so on.
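
As a sanity check, the three set-ups can be enumerated directly. This is only a minimal sketch: the labels `B1`, `B2`, `G1`, `G2` and the helper `both_girls` are made up for illustration, and it assumes every pair of students is equally likely.

```python
from fractions import Fraction
from itertools import combinations, permutations

# Hypothetical labels for the four students; only the first letter (the sex) matters.
students = ["B1", "B2", "G1", "G2"]

def both_girls(pair):
    """True when every student in the pair is a girl."""
    return all(name.startswith("G") for name in pair)

# 1. Sequential: P(first is a girl) * P(second is a girl, given the first was).
sequential = Fraction(2, 4) * Fraction(1, 3)

# 2. Uniform over the 4*3 = 12 ordered pairs of students.
ordered = list(permutations(students, 2))
p_ordered = Fraction(sum(both_girls(p) for p in ordered), len(ordered))

# 3. Uniform over the C(4,2) = 6 unordered pairs of students.
unordered = list(combinations(students, 2))
p_unordered = Fraction(sum(both_girls(p) for p in unordered), len(unordered))

print(sequential, p_ordered, p_unordered)  # prints: 1/6 1/6 1/6
```

All three agree because each one is a different bookkeeping of the same underlying distribution on pairs of students.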


Your calculations are done under a different interpretation: that you are sampling uniformly from whatever set you happen to use to represent the outcomes. For a clearer example of the distinction, suppose you had a classroom of 1 million girls and one single boy. Now pick two students; what is the probability that you have one of each sex? If you pick uniformly from the set of pairs of students, the probability is overwhelming that you will get two girls. If you sample uniformly from the set of pairs of sexes where order doesn't matter ({BG},{GG}), you will get one of each sex with probability $\frac{1}{2}$. Usually when we say "pick randomly" we mean the former, but as you say, the latter is not a "wrong" interpretation per se.
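
To put numbers on the classroom example, here is a small sketch of both interpretations. The function name `p_mixed_uniform_pairs` is hypothetical, and the first calculation assumes every unordered pair of students is equally likely.

```python
from fractions import Fraction
from math import comb

def p_mixed_uniform_pairs(girls, boys):
    """P(one of each sex) when every unordered pair of students is equally likely."""
    total_pairs = comb(girls + boys, 2)
    mixed_pairs = girls * boys  # one girl and one boy
    return Fraction(mixed_pairs, total_pairs)

# Interpretation 1: uniform over pairs of students.
p_students = p_mixed_uniform_pairs(1_000_000, 1)

# Interpretation 2: uniform over the unordered sex pairs that can occur, {B,G} and {G,G}.
possible_sex_pairs = [("B", "G"), ("G", "G")]
p_sexes = Fraction(1, len(possible_sex_pairs))

print(p_students)  # prints: 2/1000001
print(p_sexes)     # prints: 1/2
```

The two answers differ by a factor of roughly 250,000, which is why the choice of interpretation has to be stated.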


EDIT: A bit more explanation that might be helpful. The first, simple and intuitive formula we learn for finding the probability of something happening is $$\frac{\text{number of outcomes we want}}{\text{total number of outcomes}}.$$ This formula is only true, however, when all of the outcomes are equally likely. So, for instance, rolling a 2 on a fair 6-sided die has probability $\frac{1}{6}$, because you are equally likely to roll any of the six numbers. What's the probability of rolling a 2 on a loaded die? It all depends on how the die is loaded. Different ways of loading the die will give you different results, and you will need to modify the formula above to correctly weight the different outcomes based on exactly how the die is loaded.
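
One way to write the "modified formula" is to add up the probabilities of the outcomes we want instead of counting them. The sketch below assumes a made-up loading (a 2 comes up half the time, every other face one time in ten) purely for illustration; `probability` is a hypothetical helper, not a library function.

```python
from fractions import Fraction

def probability(event, dist):
    """Add up the probabilities of the outcomes in `event`."""
    return sum((dist[outcome] for outcome in event), Fraction(0))

faces = range(1, 7)
fair = {face: Fraction(1, 6) for face in faces}
# Hypothetical loading, chosen only for illustration.
loaded = {face: (Fraction(1, 2) if face == 2 else Fraction(1, 10)) for face in faces}
assert sum(loaded.values()) == 1  # a distribution must sum to 1

event = {2}
counting_formula = Fraction(len(event), len(faces))  # ignores the loading

print(counting_formula, probability(event, fair), probability(event, loaded))
# prints: 1/6 1/6 1/2
```

The counting formula matches the weighted sum for the fair die and disagrees with it for the loaded one.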

Now, to your problem of picking students. The usual interpretation of the problem you quoted is that you want to pick two students so that each pair of students is equally likely to get picked. (Whether or not the pairs are ordered turns out not to matter, but for now let's say the pairs are ordered.) This is how you get the probability $\frac{2}{12}$ for picking two girls, using the formula above.

This brings us to key point #1: if you pick ordered pairs of students, with each pair equally likely, you could instead record only the unordered pair of sexes: but the unordered pairs of sexes are not equally likely! In my example above (it's harder to get confused when there is a huge disparity in the numbers), we can replace a fair $1000001\cdot 1000000$-sided die with a 2-sided die by looking at unordered pairs of sexes -- but this 2-sided die is now loaded, because under the first scheme you are much more likely to pick two girls than one girl and one boy! To get the same answer using unordered sexes instead of ordered names, you need to use a more sophisticated formula that takes into account the relative probability of the various outcomes.
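
The same "loading" shows up in the original problem with 2 boys and 2 girls. As a rough sketch (the student labels and the helper `sex_pair` are invented for illustration), collapse the 12 equally likely ordered pairs of students down to unordered pairs of sexes and see what weights come out:

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

students = ["B1", "B2", "G1", "G2"]  # hypothetical labels
ordered_pairs = list(permutations(students, 2))  # 12 equally likely outcomes

def sex_pair(pair):
    """Forget names and order: ("G1", "B2") -> "BG"."""
    return "".join(sorted(name[0] for name in pair))

counts = Counter(sex_pair(p) for p in ordered_pairs)
loaded = {key: Fraction(n, len(ordered_pairs)) for key, n in counts.items()}

naive_gg = Fraction(1, len(loaded))  # pretends BB, BG, GG are equally likely
for key in sorted(loaded):
    print(key, loaded[key])
# prints:
# BB 1/6
# BG 2/3
# GG 1/6
```

Treating the three sex pairs as equally likely gives $\frac{1}{3}$ for two girls; using the weights they actually inherit from the students gives $\frac{1}{6}$ again.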

Point #2: We interpreted "choose two students" to mean "pick two students so that each ordered pair of students is equally likely." This is purely a convention -- "choose two students" on its own is mathematically meaningless. You have to provide the rule that says how likely each outcome is relative to the others -- mathematically speaking this is called a probability distribution. It's like if I ask you, "what's the probability of rolling a 1 on this die that I'm holding?" Is it a loaded die? A fair die? How is it loaded? If I don't tell you, there is no right answer to my question. Similarly, you have to say how you're picking the students, or there's no single right answer to the question you quoted. The interpretation I've used throughout the above is the one that I think most people are likely to say is the "natural" one, but that's a matter of convention and taste, not mathematics.

Incidentally, failing to precisely define the probability distribution you're using is the source of a large number of "paradoxes" and controversies in probability. You might be interested in hearing about the Monty Hall problem, where ambiguity about the assumptions of the problem has caused years' worth of controversy (including between mathematics PhDs!).

On

You have used various sample spaces to answer the question. There is nothing wrong with that, as long as you correctly calculate the probabilities for the elements of your chosen sample space.

For some of your choices of sample space, one can make an argument that the elements of the sample space are equally likely. It is convenient to use such a sample space, if possible, for then the probability calculation comes down to counting the "favourables."

For your sample space with the people indistinguishable, it would be hard to argue that double boy is just as likely as mixed. For it is clear that if the people being chosen are sitting behind four separate but identical screens, then the probability of a mixed choice is exactly the same as the probability if the choice is made at random, with all pairs of people equally likely to be chosen.

Behind every "real-world" probability calculation, there is a mathematical model, in this case a mathematical model of the choosing process. To decide on an appropriate mathematical model, it is necessary to know in fair detail what the choosing process was. For the situation of the problem, a reasonable model is that the $\binom{4}{2}$ pairs of people, or the $(4)(3)$ ordered pairs of people, are equally likely.
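
Under that model, the repeated-trials experiment described in the question can be sketched as a short simulation. This assumes the choosing process is "draw two different students at random, all pairs equally likely"; the function name, the trial count, and the seed are arbitrary choices for illustration.

```python
import random

def estimate_two_girls(trials, seed=0):
    """Estimate P(both chosen students are girls) by repeated random draws."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    students = ["B", "B", "G", "G"]
    hits = 0
    for _ in range(trials):
        # Two different students, drawn without replacement.
        picked = rng.sample(students, 2)
        if picked == ["G", "G"]:
            hits += 1
    return hits / trials

estimate = estimate_two_girls(200_000)
print(estimate)
```

The estimate should settle near $\frac{1}{6} \approx 0.167$ rather than $\frac{1}{4}$ or $\frac{1}{3}$, because the simulation fixes one particular choosing process regardless of how the outcomes are later labelled.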