Why do successive rolls of a die increase probability?


So in my maths textbook the question asks: "What is the probability of rolling at least one 5 in 6 rolls?"

So of course the probability of rolling a 5 is 1/6. Supposedly rolling the die 6 times then increases the probability from 1/6 to 1-(5/6)^6. I understand how to calculate it, just not why the probability would increase when the first roll doesn't affect the second roll, etc. Yes, successive rolls mean you might have failed the first roll, but upon throwing the second time it's still a 1/6 chance of throwing a 5?! I mean, even upon throwing the 1000th time, you would still have a 1/6 chance...

Thanks!


There are 2 best solutions below

1
On

Repeatedly rolling a die does not increase your chance of getting a $5$ in a single roll - it's still $1/6$. But $1-(5/6)^6$ is not the probability of getting a $5$ in a single roll, but the probability of getting at least one $5$ in $6$ rolls. Think of the following two scenarios:

Scenario 1.

Roll a die $N$ times, and record the number of times it's a $5$. If $N$ is large enough, you should find that approximately $N/6$ of them are $5$'s.

Scenario 2.

Step 1) Initialize a variable, let's say $x$. So $x = 0$ initially.
Step 2) Roll a die $6$ times. If there is at least one $5$ in these $6$ rolls, add $1$ to $x$.
Step 3) Repeat step 2) $N$ times. If $N$ is large enough, you will find that $x$ is approximately equal to $N(1-(5/6)^6)$.

Is it more clear now? $1/6$ and $1-(5/6)^6$ are probabilities that correspond to different scenarios.
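If it helps to see the two scenarios side by side, here is a minimal simulation sketch. The function names, the seed and the choice of $N = 100000$ are my own assumptions for illustration, not part of the textbook question.

```python
import random

def fraction_of_fives(n_rolls, rng):
    """Scenario 1: roll n_rolls times, return the fraction of rolls that show a 5."""
    hits = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 5)
    return hits / n_rolls

def fraction_of_batches_with_a_five(n_batches, rng, rolls_per_batch=6):
    """Scenario 2: repeat 'roll 6 times' n_batches times, return x / n_batches."""
    x = 0  # step 1: initialize the counter
    for _ in range(n_batches):  # step 3: repeat step 2
        # step 2: add 1 if at least one of the 6 rolls is a 5
        if any(rng.randint(1, 6) == 5 for _ in range(rolls_per_batch)):
            x += 1
    return x / n_batches

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, an arbitrary choice for repeatability
    n = 100_000
    print("scenario 1:", fraction_of_fives(n, rng), "theory:", 1 / 6)
    print("scenario 2:", fraction_of_batches_with_a_five(n, rng),
          "theory:", 1 - (5 / 6) ** 6)
```

The per-roll frequency should hover around $1/6$ no matter how many rolls you make, while the per-batch frequency should hover around $1-(5/6)^6$, because the two functions are counting different things.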

1
On

The key to the answer is the "at least one" part of the question. The difficulty you're having is that you're applying logic regarding a simple event (a single "test" with one outcome) to a compound event (multiple simple events, with the outcome determined by some combination of the simple events), when the probabilities of compound events are by nature more complex.

To restate the question:

What is the probability of rolling at least one 5 in 6 rolls [of a 6-sided die]?

That probability is complex to calculate as such, because it is the sum of the probability of all possible outcomes of six rolls of the die in which a 5 appears at least once. There are six possible outcomes of any roll, and you roll six times, so the probability tree has $6^6 = 46656$ leaves (possible unique cases). There are 46,656 unique sequences of the 6 possible values taken 6 at a time "with replacement" (which is the case for independent events like dice rolls, where the same value can be seen multiple times).

That's a lot of possibilities, and what you say is exactly true: regardless of what you've rolled so far, your next roll could be a 5. So, at each tier of the tree, the simple event can occur, which means the compound event has occurred, and all the remaining rolls are immaterial because we already know the outcome. If you roll a 5 the first time, no matter what you get on the other 5 rolls ($6^5 = 7776$ possibilities), the event's occurred. If you don't get it on the first roll (5 subtrees), you can get it on the second, and that's an additional $6^4 = 1296$ cases of the remaining four rolls; this identical subtree occurs once in each of the five subtrees where we didn't get a 5 the first time (6480 total additional cases).

If you don't get it in two tries, you can get it on the third, and the remaining three rolls are immaterial (216 cases), and each of those 216 cases can occur given any of $5*5 = 25$ possibilities where a 5 didn't occur on the first two rolls, for another 5400 total unique cases that mean the event occurs. A five could still be rolled for the first time on the fourth roll (4500 cases), the fifth roll (3750) or the very last one (3125).

On and on this goes, until you've counted all unique subtrees that contain at least one 5 (spoiler alert, there are 31031). Divide that by the total number of unique cases of rolling a fair die 6 times, 46656, and you have your answer. This is a relatively complex analysis involving keeping track of the number of subtrees that haven't yet seen a 5, and the number of possible remaining paths that could have one. Trust me, it's pretty easy to screw up.
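The tier-by-tier count above can be reproduced with a short sketch. The helper name `first_five_counts` is made up for illustration; it assumes the first 5 appears on roll $k$, so the $k-1$ earlier rolls each have 5 non-five options and the later rolls are unrestricted.

```python
ROLLS = 6
SIDES = 6

def first_five_counts(rolls=ROLLS, sides=SIDES):
    """Number of sequences whose first 5 appears on roll k, for k = 1..rolls."""
    return [(sides - 1) ** (k - 1) * sides ** (rolls - k)
            for k in range(1, rolls + 1)]

counts = first_five_counts()
total_with_five = sum(counts)
total_sequences = SIDES ** ROLLS

print(counts)           # [7776, 6480, 5400, 4500, 3750, 3125]
print(total_with_five)  # 31031
print(total_sequences)  # 46656
print(total_with_five / total_sequences)
```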

And this is an easy case; what if the question asked the chances of getting at least two fives? Now you can't just take all the remaining paths for granted once you see one, as we did above; you have to keep tracing paths where you have seen none, or only one, until you see two. And what if there weren't 6 rolls allowed, but 10? That's 60 million (60,466,176) possible unique sequences of dice values to have to go through.

The manual calculation of these kinds of compound events, where you don't care exactly when a simple event happens as long as it's within the total number of tests being performed, quickly becomes unmanageable to handle in this relatively brute-force manner of parsing the decision tree.


However, there's an easier way than all of the above.

Recall that given an event $E$ with probability $P(E)$, the probability of that event not occurring, $P(E')$, is $P(E') = 1 - P(E)$. This is trivial enough to see in a single roll of the die; with 6 different sides and an equal chance to land on any of them, the chance you will land on the "fifth side" is $\frac16$, and the chance you don't is the chance of landing on any of the others, $\frac56$.

However, this simple relationship between these two outcomes gives us a massive leg up in situations like the asked question. Rather than calculate the probability of the event occurring, what if we looked at the probability of the event not occurring? You will either see at least one five in six rolls, or none at all, so the probability of seeing at least one is the same as the probability of not seeing none.

Well, what's the probability that you don't roll a 5 even once in 6 chances? It's the chance of an outcome in which roll 1 is not a 5, and roll 2 is not a 5, and roll 3 is not a 5, and so on through the other 3 rolls. "And" means "times" when you talk about compound probabilities of independent events, so the chance of not seeing a five rolled at all in 6 tries is $\frac56 \cdot \frac56 \cdot \frac56 \cdot \frac56 \cdot \frac56 \cdot \frac56 = \left(\frac56\right)^6 \approx 0.3349$.

That's the chance of seeing none at all, so the probability of seeing at least one, the probability of "not seeing none", is $1-\left(\frac56\right)^6 \approx 0.6651$. Just to check, since we did all that work earlier, $\dfrac{31031}{46656} \approx 0.6651$. So, yes, the decision tree method does work, but so does this. And, I hope you'll agree, the calculation is far simpler.
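As a quick sketch of the complement rule in code (the function name is hypothetical, and exact fractions are used only to avoid rounding), the same formula works for any number of rolls:

```python
from fractions import Fraction

def p_at_least_one_five(n_rolls):
    """Exact probability of at least one 5 in n_rolls rolls of a fair six-sided die."""
    p_no_five = Fraction(5, 6) ** n_rolls  # every roll misses, independently
    return 1 - p_no_five

for n in (1, 2, 6, 10):
    p = p_at_least_one_five(n)
    print(n, p, float(p))
```

For a single roll this gives $1/6$, and for six rolls it gives $31031/46656$. The per-roll chance never changes; what grows is the chance that at least one of several independent attempts succeeds.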


To address the more complex examples I mentioned, let's do a little extra credit here, and consider the probability of seeing at least two fives in six rolls. That analysis is similar in concept, but the decision tree method is now totally impractical; we'd be considering so many sub-cases of all possibilities that this answer would be a true book just to do the calculations, and as you saw, it's not necessary.

Instead, consider the opposite of what we're asked. The chance of seeing at least two fives is the chance we don't see zero and that we don't see only one. Stated equivalently, it's the chance we don't see either zero or one. We have already calculated the probability of getting zero fives (0.3349), so now we just need to know, what's the chance of only getting one?

The probability of getting only one five is the probability that it happens exactly once, meaning it doesn't happen the other 5 times out of 6 tries. So given the six total tests, we want one test outcome that has a 1/6 simple probability, and five test outcomes with a 5/6 probability. For one particular arrangement, say a five on the first roll and none after, that's $\frac16 \cdot \left(\frac56\right)^5 \approx 0.06698$. But the single five could land on any of the six rolls, and those six arrangements are mutually exclusive, so we add them up: $6 \cdot \frac16 \cdot \left(\frac56\right)^5 = \left(\frac56\right)^5 \approx 0.4019$. Back to the decision tree, the concept is that, instead of including the entire subtree of possible outcomes once we see a 5, we can only include the cases where we don't see another 5; that's $6 \cdot 5^5 = 18750$ of the 46656 sequences.

Now we know the probability of none, and the probability of one. The probability of at least two is the probability of "not zero or one", so $P(\ge2) = 1 - (0.3349 + 0.4019) \approx 0.2632$.
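Since 46656 sequences is small for a computer, a brute-force tally is one way to sanity-check an "at least two fives" figure. This is only a sketch; the function name is made up for illustration.

```python
from collections import Counter
from itertools import product

def count_by_number_of_fives(rolls=6, sides=6):
    """Map k -> number of equally likely sequences containing exactly k fives."""
    faces = range(1, sides + 1)
    return Counter(seq.count(5) for seq in product(faces, repeat=rolls))

tally = count_by_number_of_fives()
total = sum(tally.values())                  # 6**6 = 46656
at_least_two = total - tally[0] - tally[1]   # everything except zero or one five

print(tally[0], tally[1], at_least_two)      # 15625 18750 12281
print(at_least_two / total)
```

Dividing 12281 by 46656 gives roughly 0.2632.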

This leads us to a critical concept you will probably see very soon, called the "binomial distribution". This is a general-purpose formula for "the probability of exactly X occurrences of an event with probability P in N total trials":

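$P(X) = {N\choose X} P^X (1-P)^{N-X}$

You saw this formula when I was calculating the probability of exactly one 5. The $P^X (1-P)^{N-X}$ part is, very simply, the probability that the event occurs the stated number of times and doesn't occur any of the other times in the overall N trials, in one particular order. The binomial coefficient ${N\choose X}$ counts how many different orders there are, that is, the number of ways to choose which X of the N trials the event occurs in. For exactly one 5 in six rolls, ${6\choose 1} = 6$.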
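Here is a minimal sketch of the binomial formula using Python's standard library. The name `binomial_pmf` is my own; `math.comb` supplies the binomial coefficient and exact fractions avoid rounding.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x occurrences in n independent trials, each with probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

p = Fraction(1, 6)              # chance of a 5 on one roll
p0 = binomial_pmf(0, 6, p)      # no fives in six rolls
p1 = binomial_pmf(1, 6, p)      # exactly one five in six rolls

print(p0, float(p0))            # 15625/46656
print(p1, float(p1))            # 3125/7776
print(1 - p0 - p1)              # 12281/46656, i.e. at least two fives
```

Summing `binomial_pmf(x, 6, p)` over `x = 0..6` gives exactly 1, which is a handy check that the coefficient has not been left out.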

This formula is ridiculously important in statistics and probability. If you plot/graph the binomial distribution for a sufficiently large N, and choose your axis scales appropriately to frame the graph, you get a shape that will become sickeningly familiar to you:

[Image: plot of the binomial distribution for a large N]