Question on dice rolling -- expected number of rolls to get a particular sequence

642 Views Asked by At

Came across this solution to problem on expected number of dice rolls to get 1,2,3,4,5,6. I reproduce it below:

For $i=1$ to $n$, define Bernoulli random variables $X_i$ by $X_i=1$ if at $i$ we have the beginning of the sequence $123456$, and by $X_i=0$ otherwise. Then $Y=\sum_1^n X_i$ is the number of times the sequence $123456$ appears.

By the linearity of expectation, we have $E(Y)=\sum_1^n E(X_i)$. But for any $i$ which is not too large, $E(X_i)=\Pr(X_i=1)=\frac{1}{6^6}$. It follows that $E(Y)=\frac{n-5}{6^6}$.


I believe I follow the solution. But it seems like, if we go for any other sequence, like 111222, I would get the same result, since the linearity of expectation doesn't require random variables be independent.

Though intuitively, 111222 should have fewer expected number of rolls than 123456? Since for 123456, I have to get reach roll exactly right, otherwise I have to start over for the desired sequence. But for 111222, it doesn't matter if I get 10 1's in a row, as long as I get next roll = 2, I can keep going? i.e., I don't have to "start over", I can resume halfway. Actually, this solution explicity pointed out that for other sequences, the number of rolls are different.

So, does the provided answer doesn't work for arbitrary sequence? If not, why?

[EDIT after Joriki's answer] My main question is why the linearity of expectation doesn't work. Yes I get that in the approach of linearity of expectation, we are looking for n such that E(y)=1, which might be a different question from "expected number of rolls until we hit 123456", but I struggle to understand why they are different.

1

There are 1 best solutions below

4
On BEST ANSWER

It seems that you misunderstood the first question you linked to, possibly due to its ungrammatical formulation. The first linked question says “we count the number of time when the sequence $123456$ appears”, which is unclear because it’s ungrammatical, but the rest of the question suggests that the intended meaning was “we count the number of times the sequence $123456$ appears”, and this is accordingly also the formulation used in the accepted answer.

Your question about the expected number of rolls to get $123456$ (which is also the question asked in the second question you linked to) is a very different question – the answer is much larger (of the order of $6^6$ instead of $6^{-6}$); it cannot be answered by just applying linearity of expectation; and you’re right in thinking that the answer to this question does depend on the particular sequence.