Chance of generating two zeros in a row, compared to chance of generating a zero and a one in a row when generating random numbers between 0 and 5??

662 Views Asked by At

I tried to code a simple python program which does the following: it generates random numbers between 0 and 5 until a zero is generated twice in a row. it does this N times and calculates the average number of generated numbers needed to get two zeroes in a row.

The second part of the program does exactly the same, except it stops when 1 is generated after a 0.

I expected the average number of repetitions needed in both cases would be about the same, but there is a notable difference. I cannot find an error in my code, nor in my intuition. But there is certainly an error in one (or both).

I tried to simplify my code as much as I could:

import math 
import random as rnd


def null_null():
    ct=0
    first=(math.floor(rnd.random()*6))
    second=(math.floor(rnd.random()*6))  
    while (first!=0 or second!=0):
        first=second
        second=(math.floor(rnd.random()*6))
        ct+=1
    return ct

def null_one():
    ct=0
    first=(math.floor(rnd.random()*6))
    second=(math.floor(rnd.random()*6))  
    while (first!=0 or second!=1):
        first=second
        second=(math.floor(rnd.random()*6))
        ct+=1
    return ct


res1=res2=0

N=1000
for i in range(N):
    
    res1+=null_null()
    res2+=null_one()


print("Average for 0-0: ", res1/N)
print("Average for 0-1: ", res2/N)
3

There are 3 best solutions below

0
On

Preliminary

The probability to get a $0$ is $\frac{1}{6}$. Therefore, the expected number of digits to get a $0$ is $6$.

Consecutive 0s

Let the expected sequence length to get the first consecutive $0$s be $x$. The expected length must satisfy the recursive relation:

$$ x = 6+\frac{1}{6}\cdot 1 + \frac{5}{6}\cdot \left(x+1\right) $$

Some explanations:

  • the expected sequence length to get a $0$ is $6$
  • there is a $\frac{1}{6}$ chance that the next digit is $0$ and the sequence ends with just one additional digit
  • there is a $\frac{5}{6}$ chance that the next digit is not $0$ but we expect to get consecutive $0$s in another $x$ digits (hence $x+1$)

Solving for $x$, we get $x=42$. In your script, you have two digits before the loop starts. Which is why you get on average $40=42-2$ iterations.

0 followed by 1

Let the expected sequence length to get the first $01$ be $x$. Let the expected additional sequence length to get the first $01$ when our last digit is $0$ be $y$. The two must satisfy the relation:

$$ \begin{aligned} x &= 6 + y \\ x &=6+\frac{1}{6}\cdot 1 + \frac{4}{6}\cdot \left(x+1\right) + \frac{1}{6}\cdot\left(y+1\right) \end{aligned} $$

Some explanations:

  • the expected sequence length to get a $0$ is $6$
  • there is a $\frac{1}{6}$ chance that the next digit is $1$ and the sequence ends with just one additional digit
  • there is a $\frac{4}{6}$ chance that the next digit is neither $0$ nor $1$ but we expect $01$ in the following $x$ digits (hence $x+1$)
  • there is a $\frac{1}{6}$ chance that the next digit is $0$ and we expect to get $01$ in the following $y$ digits (hence $y+1$)

Solving for $x$, we get $x=36$. Again, your loop start with two digits, so you get on average $34=36-2$ iterations.

0
On

(Aside: the random library in python has a built in function for generating integers, so you don't have to mess with floor functions. It is random.randint(start, stop)).

Now, as for the math, suppose for simplicity you are generating a binary string of zeros and ones, stopping when you see two consecutive zeros, and a zero followed by a one respectively.

Let us analyze the case where we are looking for two consecutive zeros. If the first two digits are both $0$, we are done. This occurs with probability $\frac{1}{4}$. Otherwise, we generate another digit. Given that we aren't done in the previous step, the only scenario in which we are done in the second step is if our string is now $100$.

Let us now analyze the other case. Similarly to before, if the first two digits are $0$ followed by $1$, we are done. This occurs with probability $\frac{1}{4}$. Otherwise, we generate another digit. Now comes the difference. Given that we aren't done in the previous step, we now have two scenarios in which we are done: $001$ or $101$. Hence, it is more likely to stop earlier looking for the sequence $01$, than $00$.

The same phenomenon will occur when generating integers in any range; the process will stop earlier when looking for a specific distinct pair of numbers instead of two identical numbers.

0
On

There are already some good and detailed answers. I just want to add a quick handwaving one, which may help understand where the intuition fails. Let's assume that the target sequence (00 or 01) hasn't occurred by the n-th number and then the n-th number is a 0. Your intuition is telling you that observing 00 or 01 should be equally likely and that's correct. However, where the two situations differ is what happens when the next number is not a 0 (in the first case) or a 1 (in the second case). In the first case, if digit n+1 is not a 0, then no target sequence can start at n+1 but we need to wait until at least n+2. On the contrary, when the target is 01, we might observe a 0 at digit n+1 and this might start a target sequence starting at n+1. In other words, with 00 if the n-th digit is a zero but no target sequence starts at n, we know for sure that no target sequence starts at n+1. This doesn't happen when the target is 01.