Probability of getting 6 heads in a row from 200 flips and intuition about this high value

6.1k Views Asked by At

A few days ago I had an argument with a friend about this question :

What is the probability of getting 6 heads in a row from 200 flips ?

I argued it is high probability (significantly bigger than a half) while he argued it is low probability.

When I tried to give exact formula I failed, so we checked the web were the answer was about 84%, yet he is still not convinced so from this I have two questions:

  1. What is the exact formula for $k$ Heads in a row (consecutive) out of $n$ coin flips?

  2. (Not a mathematical) How to convince my friend that 6 in a row have high probability? Meaning what is the intuition behind the question ?

4

There are 4 best solutions below

0
On BEST ANSWER

Here is how to calculate the exact answer. Consider a Markov chain $X_0,X_1,\ldots,X_{200}$, taking integer values in the range $0\le X_n\le 6$, with transition matrix (with row and column indices in the range $0\le i,j\le6$) $$M=\pmatrix{\frac12&\frac12&0&0&0&0&0&\\ \frac12&0&\frac12&0&0&0&0&\\ \frac12&0&0&\frac12&0&0&0&\\ \frac12&0&0&0&\frac12&0&0&\\ \frac12&0&0&0&0&\frac12&0\\ \frac12&0&0&0&0&0&\frac12\\ 0&0&0&0&0&0&1}$$ Here the idea is that $X_n$ represents the number of consecutive heads ending at flip $n$ (with the conventional courtesy value $X_0=0$) so that the flip sequence HTHH would cause $X_0=0$, $X_1=1$, $X_2=0$, $X_3=1$, $X_4=2$, and so on. Except the value $X_n=6$ means something different: either $X_{n-1}=6$ or the $n$-th flip was H and $X_{n-1}=5$. The chain is started with the value $X_0=0$; what is sought is the probability that $X_{200}=6$. This is the $(0,6)$-th entry in the matrix $M^{200}$. When I do these calculations I get a value very close to $.8$.

Here is a way to visualize this chain. There is a small spider that aspires to climb to the top of a $6$ segment pipe. It starts at the bottom, and does the following $200$ times:

  • if it is at the top, it stays at the top,
  • otherwise it flips a coin and drops all the way to the bottom if the coin shows T, and climbs up more more segment if the coin shows H.
1
On

This kind of varies from the original question, but here it goes. There are 64 outcomes that could happen when flipping a coin $6$ times. $2^6=64$. The probability of getting heads 6 times in $6$ flips is $1/64$. $200/6=33.333(33)$. $$\left(1-\dfrac 1 {64}\right)^{33}=(.6)$$ roughly) But, you could get $6$ tails. $2$ outcomes of a $6$ streak. Your chances double which means it is roughly $90$%

0
On

All the possible different results of $n$ flips of a (fair) coin correspond to all possible binary strings of length $n$, which clearly are $2^n$.
So let's talk of binary strings, with $1=H, \; 0=T$.

a) Exact formula

Partition the $2^n$ strings into those containing $s$ ones and $m=s-n$ zeros.
Each partition will contain $\binom{n}{s} =\binom{n}{m} $ strings.

Now the number of strings with $s$ "$1$" and $m$ "$0$", having runs of at most $r$ consecutive ones
is given by $$ N_b (s,r,m + 1)\quad \left| {\;0 \leqslant \text{integers }s,m,r} \right.\quad = \sum\limits_{\left( {0\, \leqslant } \right)\,\,k\,\,\left( { \leqslant \,\frac{s} {r}\, \leqslant \,m + 1} \right)} {\left( { - 1} \right)^k \left( \begin{gathered} m + 1 \\ k \\ \end{gathered} \right)\left( \begin{gathered} s + m - k\left( {r + 1} \right) \\ s - k\left( {r + 1} \right) \\ \end{gathered} \right)} $$ whose generating function in $s$ is $$ F_b (x,r,m + 1) = \sum\limits_{0\,\, \leqslant \,\,s\,\,\left( { \leqslant \,\,r\,\left( {m + 1} \right)} \right)} {N_b (s,r,m + 1)\;x^{\,s} } = \left( {\frac{{1 - x^{\,r + 1} }} {{1 - x}}} \right)^{m + 1} $$ as explained in this related post

Therefore the number of binary strings with $n$ bits, having runs of at most $r$ consecutive ones
is $$ \bbox[lightyellow] { \eqalign{ & C(n,r) = \sum\limits_{\left( {0\, \le } \right)\,\,m\,\,\left( { \le \,n} \right)} { N_b (n - m,r,m + 1)} = \cr & = \sum\limits_{\left( {0\, \le } \right)\,\,m\,\,\left( { \le \,n} \right)} { \sum\limits_{\left( {0\, \le } \right)\,\,k\,\,\left( { \le \,{s \over {r + 1}}\, \le \,m + 1} \right)} {\left( { - 1} \right)^k \binom{m+1}{k} \binom{ n - k\left( {r + 1} \right)} {n - m - k\left( {r + 1} \right) } } } \cr} } \tag{1}$$ whose generating function in $n$ is $$ G(z,r) = \sum\limits_{0\, \le \,n} {C(n,r)z^{\,n} } = {{1 - z^{\,r + 1} } \over {1 - 2z + z^{\,r + 2} }} $$ This is interesting because from it it is possible to demonstrate that the above approach coincides with the Markovian approach proposed by @kimchi lover.

We conclude that the probability you are looking for is $$ \bbox[lightyellow] { P_{\,200,\,6} = 1 - {{C(200,5)} \over {2^{\,200} }} = 0.8009 \ldots } \tag{2}$$

b) * Asymptotics*

For large $n$ the exact formula (1) would become quite "heavy", and we need an asymptotic formula.

A very interesting fact is that $C(n,r)$ are just the $(r+1)$-nacci numbers, apart from a shift in $n$ depending on their definition.

So $C(n,5)$ corresponds to the Hexa-nacci numbers $$ C(n,5) = F_{n + 6}^{\left( 6 \right)} $$ as defined in Wikipedia.
In this paper "A Simplified Binet Formula for k-Generalized Fibonacci Numbers" - G.P.B. Dresden, Z. Du you can find interesting asymptotic formulas to approximate the Hexa-nacci with a Binet-like formula.

c) Convincing your friend

This is clearly .. the most difficult part.
But you can try via .. [Entropy](https://en.wikipedia.org/wiki/Entropy:
the strings in which $r$ is allowed to go from $6$ up to $200$ are far more "chaotic" than the strings where $r$ is constrained not to overcome $5$.

0
On

A bit late, I ran some trials to see.

I did 5000 trials on a sim I wrote in python. Other readers have done 33 sets of 6 tosses, which is not what you asked for. In the case of TTTHHH HHHTTT TTTHHH, that method would say there are 0 streaks of 6.

Here is the code I used: def toss100(): numStreak = 0 truestreak = 0
falsestreak = 0 for x in range(200):

     flip = coin()
     #print(flip)

     if (flip == True):
         falsestreak = 0
         truestreak += 1
     elif (flip == False):
         falsestreak += 1
         truestreak = 0

     #print("heads streak: ", falsestreak)
     #print("tails streak: ", truestreak)    
     if (falsestreak == 6 or truestreak == 6):
         numStreak += 1
         falsestreak = 0
         truestreak = 0



return numStreak

total = 50000 streaks = 0 for i in range(total): streaks += toss100() print ("Total Streaks: ", streaks) print("Streaks per 200 tosses: ", streaks/50000)

And the output: Total Streaks: 155647 Streaks per 200 tosses: 3.11294

Hope this helps.