throwing a $k$-sided die until you get every face the same number of times


You throw a fair $k$-sided die repeatedly and keep track of all the results. You stop once you have seen every face exactly the same number of times. What is the expected number of throws? Is that number even finite?

Start with the simplest example of a $2$-sided die, i.e. a fair coin. There is a 50% chance that after 2 flips you have one head and one tail, so you stop. There is a 12.5% chance that after 2 flips you are at 2:0 (or 0:2), continue, and after 4 flips reach 2:2, so you stop there. One can compute a few more terms, but I don't see enough of a pattern to get a summable series.
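These small probabilities can be checked by brute force; here is a quick sketch (the helper name `stop_time` is my own) that enumerates all coin sequences of length 4:

```python
from collections import Counter
from itertools import product

def stop_time(seq):
    """First flip index at which heads and tails counts are equal, else None."""
    heads = tails = 0
    for i, flip in enumerate(seq, 1):
        heads += flip
        tails += 1 - flip
        if heads == tails:
            return i
    return None

# Enumerate all 2^4 equally likely coin sequences of length 4.
tally = Counter(stop_time(s) for s in product((0, 1), repeat=4))
print(tally[2] / 16, tally[4] / 16)  # 0.5 0.125
```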

It is not too hard to compute the probability that after $k\cdot m$ throws of a $k$-sided die every face has come up exactly $m$ times, but these events are not independent for different $m$, so I'm not sure whether that is helpful.
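That probability is a straightforward multinomial count, $(km)!/\bigl((m!)^k k^{km}\bigr)$; a minimal sketch (the function name is my own):

```python
from math import factorial

def p_flat(k: int, m: int) -> float:
    """Probability that after k*m throws of a fair k-sided die
    every face has come up exactly m times (multinomial probability)."""
    return factorial(k * m) / (factorial(m) ** k * k ** (k * m))

print(p_flat(2, 1))  # 0.5  (one head and one tail in two flips)
print(p_flat(3, 2))  # 6!/(2!^3 * 3^6) = 10/81
```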

The relation to random walks looks useful. One can map the results of throwing a coin repeatedly to a random walk on the integers. Seeing the same number of heads and tails is equivalent to returning to the starting point (zero). This is a well-studied problem, and it is known that the walk returns to zero with probability one in finite time. The number I'm looking for is the expected time of the first return to zero, but I haven't found any results on that.

For $k$ bigger than $2$ (and even) the mapping to random walks is no longer bijective. If you map the outcomes of a $6$-sided die to a random walk on the $3$-dimensional integer lattice, with each face corresponding to a step in one of the 6 axis directions, then a return to the origin is necessary but not sufficient for having seen each face the same number of times. However, it is known that a random walk on the integer lattice in dimension at least $3$ returns to the origin with probability strictly smaller than $1$. So this should prove that for a die with at least $6$ faces, the expected number of throws is not finite.
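A two-roll example already shows why the return is not sufficient. In the sketch below (the encoding of faces $0,1$ to $\pm e_1$, faces $2,3$ to $\pm e_2$, etc. is my own convention), the walk returns to the origin while the face counts stay unequal:

```python
def walk_end(seq):
    """End point in Z^3 of the walk: face f steps +/-1 along axis f // 2."""
    pos = [0, 0, 0]
    for f in seq:
        pos[f // 2] += 1 if f % 2 == 0 else -1
    return tuple(pos)

rolls = (0, 1)  # face 0 (+e1), then face 1 (-e1)
print(walk_end(rolls))                     # (0, 0, 0): back at the origin...
print([rolls.count(f) for f in range(6)])  # [1, 1, 0, 0, 0, 0]: counts unequal
```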

Is the expected number of throws finite for $k=2$, or for $k$ between $3$ and $5$? Are there estimates for it?


Accepted answer

Answer concerning $k=2$; for $k>2$ see the edit below.

If $X_2$ denotes the number of throws needed, then $X_2$ only takes positive multiples of $2$ as values, with: $$P(X_2=2n)=2^{1-2n}C_{n-1}=2^{1-2n}\frac{(2n-2)!}{n!(n-1)!}$$for $n=1,2,\dots$, leading to:$$\mathbb EX_2=\sum_{n=1}^{\infty}2n\cdot P(X_2=2n)=\sum_{n=1}^{\infty}2^{2-2n}\binom{2n-2}{n-1}=\sum_{n=0}^{\infty}2^{-2n}\binom{2n}{n}$$

Here $C_n$ stands for the $n$-th Catalan number.

In this case ($k=2$) we can do it with a fair coin.

Let $H_j$ stand for the number of throws with outcome heads and $T_j$ for the number of throws with outcome tails after the first $j$ throws.

Then have a look at the path determined by $(j,H_j-T_j)$ for $j=0,1,\dots,2n$, under the condition that $X_2=2n$.

The path starts at $(0,0)$ and ends at $(2n,0)$; WLOG the first step goes to $(1,1)$.

Then with probability $2^{2-2n}C_{n-1}$ the path runs from there to $(2n-1,1)$ while keeping a strictly positive second coordinate (there are $C_{n-1}$ such paths, each of probability $2^{2-2n}$).

Finally, with probability $2^{-1}$, the path steps down to $(2n,0)$.

We have the asymptotic equality: $$2^{-2n}\binom{2n}{n}\sim\frac{1}{\sqrt{\pi n}}$$so the terms of the last series are not summable (compare with $\sum n^{-1/2}$), and we conclude that: $$\mathbb EX_2=+\infty$$
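The divergence is easy to watch numerically: partial sums of the series for $\mathbb EX_2$ grow like $2\sqrt{N/\pi}$. A quick sketch (function name mine), using the term ratio $a_{n+1}=a_n\,(2n+1)/(2n+2)$ to avoid big binomials:

```python
def partial_sum(N):
    """Sum of the first N terms of E[X_2] = sum_n 2^(-2n) * C(2n, n)."""
    total, a = 0.0, 1.0  # a holds the term 2^(-2n) * C(2n, n), starting at n = 0
    for n in range(N):
        total += a
        a *= (2 * n + 1) / (2 * n + 2)
    return total

for N in (10, 100, 1000, 10000):
    print(N, round(partial_sum(N), 2))
```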


edit (to make the answer complete):

If $k>2$, let $U$ denote the outcome of the first throw and let $V$ be (e.g.) the smallest element of $\left\{1,\dots,k\right\}$ with $U\neq V$.

Now define $Y$ as the number of throws needed (including the first) to arrive again at a situation where outcome $U$ and outcome $V$ have appeared the same number of times.

Comparing $Y$ and $X_{2}$, we easily find that $P\left(Y>n\right)\geq P\left(X_{2}>n\right)$ for every nonnegative integer $n$ (throws showing a face other than $U$ or $V$ only delay the equalization), and consequently: $$\mathbb{E}Y=\sum_{n=0}^{\infty}P\left(Y>n\right)\geq\sum_{n=0}^{\infty}P\left(X_2>n\right)=\mathbb{E}X_{2}$$

In addition, it is evident that $X_{k}\geq Y$, so that $\mathbb{E}X_{k}\geq\mathbb{E}Y$.

Combining this with $\mathbb{E}X_{2}=+\infty$ we conclude that $\mathbb{E}X_{k}=+\infty$ for $k=2,3,\dots$

Another answer

Rolling the die $n$ times, we obtain a word of length $n$ over the alphabet $\{1,\cdots,k\}$.
There are $k^n$ different words that may be composed.

Each word corresponds to a term of the expansion of the multinomial $$ \left( {x_{\,1} + x_{\,2} + \cdots + x_{\,k} } \right)^n \quad \left| {\;x_{\,j} = j} \right. $$

If we take a histogram of the number of times each character appears, then each word corresponds to a monomial $$ x_1^{\,j_1} x_2^{\,j_2} \cdots x_k^{\,j_k} \quad \left| {\;\sum\limits_l {j_l} = n} \right. $$ and the number of different words having the same repetition histogram $$ \left( {x_1 + x_2 + \cdots + x_k } \right)^n = \sum\limits_{j_1 + j_2 + \cdots + j_k = n} {\binom{n}{j_1, j_2, \cdots, j_k} x_1^{\,j_1} x_2^{\,j_2} \cdots x_k^{\,j_k} } $$ will be the relevant multinomial coefficient.

Now, suppose you continue to roll the die up to $n$ times without stopping.
You want to know the probability of obtaining that each face appeared $t$ times, which of course means that $n=tk$.
The number of words presenting a flat histogram will be $$ \bbox[lightyellow] { N(t,k) = \binom{tk}{\underbrace{t,t,\cdots,t}_k} = \frac{(t\,k)!}{(t!)^{k}} \quad \Rightarrow \quad P(t,k) = \frac{(t\,k)!}{(t!)^{k}\,k^{t\,k}} } \tag{1}$$ and the corresponding probability $P(t,k)$ is a "cumulative" one, since it includes the cases where an equal number (less than $t$) of appearances already occurred in the preceding rolls.
For $t=0$ we have $P(0,k)=1$, corresponding to the empty word.

To proceed and find the requested probability, consider a word of length $tk$ which, besides at $n=tk$, also achieves an equal number $s<t$ of appearances of every face somewhere in between (and might also include other "equalities" before and/or after that): $$ \left[ {\underbrace {x_{\,a} ,x_{\,b} , \cdots ,x_{\,c} }_{s\,k},\underbrace {x_{\,d} ,x_{\,e} , \cdots ,x_{\,f} }_{\left( {t - s} \right)\,k}} \right] $$ The second part can then be considered a "fresh-start" word, and we may put $$ P(t,s,k) = P(s,k)P(t - s,k) $$

So the requested probability, i.e. the probability $p(t,k)$ of getting an equal number of appearances of every face at $n=tk$ and not earlier,
will be $$ \eqalign{ & P(0,k) = p(0,k) = 1 \cr & P(1,k) = p(1,k) = {{k!} \over {k^{\,k} }} \cr & P(2,k) = P(0,k)p(2,k) + P(1,k)p(1,k)\quad \Rightarrow \cr & \Rightarrow \quad p(2,k) = P(2,k) - P(1,k)^{\,2} = \left( {{{\left( {2\,k} \right)!} \over {\left( {2!} \right)^{\,k} }} - \left( {k!} \right)^{\,2} } \right){1 \over {k^{\,\,2\,k} }} \cr & \quad \quad \vdots \cr & P(t,k) = \sum\limits_{0\, \le \,s\, \le \,t - 1} {P(s,k)p(t - s,k)} \cr} $$ which is the recursive relation $$ \bbox[lightyellow] { \left\{ \matrix{ p(0,k) = 1 \hfill \cr p(t,k) = {{\left( {t\,k} \right)!} \over {\left( {t!} \right)^{\,k} k^{\,t\,k} }} - \sum\limits_{1\, \le \,s\, \le \,t - 1} { {{\left( {s\,k} \right)!} \over {\left( {s!} \right)^{\,k} k^{\,s\,k} }}p(t - s,k)} \hfill \cr} \right. } \tag{2}$$

Example

With $k=2$ and $t=0 \cdots 6$ the formula above gives $$p(t,k)= 1, \frac{ 1}{2}, \frac{ 1}{8}, \frac{ 1}{16}, \frac{ 5}{128}, \frac{ 7}{256}, \frac{ 21}{1024}$$ which matches with the formula given in the accepted answer.

With $k=3$ instead, and $t=0 \cdots 6$, we get $$p(t,k)= 1, \frac{ 2}{9}, \frac{ 2}{27}, \frac{ 272}{6561}, \frac{ 1646}{59049}, \frac{ 3652}{177147}, \frac{ 231944}{14348907}$$

Both results have been checked vs. direct computation for the lowest values of $t$.
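Recursion (2) is easy to run with exact rational arithmetic; a minimal sketch (function names are mine) reproducing the tables above:

```python
from fractions import Fraction
from functools import lru_cache
from math import factorial

def P(t, k):
    """Cumulative probability (1): every face appears exactly t times at roll t*k."""
    return Fraction(factorial(t * k), factorial(t) ** k * k ** (t * k))

@lru_cache(maxsize=None)
def p(t, k):
    """Probability that the first all-faces-equal time is exactly roll t*k,
    by recursion (2)."""
    if t == 0:
        return Fraction(1)
    return P(t, k) - sum(P(s, k) * p(t - s, k) for s in range(1, t))

print([p(t, 2) for t in range(7)])  # 1, 1/2, 1/8, 1/16, 5/128, 7/256, 21/1024
print(p(3, 3))                      # 272/6561
```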

Another answer

Take a $2$-sided die and let $N$ be the number of rolls needed to get from a position where the counts of 1s and 2s differ by $1$ to a point where they are equal. This has the same distribution as the number of rolls needed to reduce a difference of $2$ to a difference of $1$.

Suppose $\mathbb E(N)<\infty$. Then we would have $\mathbb E(N)=1+\frac12\times2\,\mathbb E(N)$: after the first roll, with probability $\frac12$ you have succeeded, and with probability $\frac12$ you have a difference of $2$, which you have to reduce twice, each reduction taking an independent copy of $N$ rolls. But this equation reads $\mathbb E(N)=1+\mathbb E(N)$ and has no finite solution, a contradiction.

It follows that the expectation is infinite for any number of sides: after the first roll, two sides have come up a different number of times, and even waiting until those two sides are equal again has infinite expectation.
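The heavy tail shows up clearly in simulation: truncated sample means of the stopping time keep growing as the truncation cap increases. A Monte Carlo sketch (function name, caps, and sample size are my own choices):

```python
import random

def stop_time(k, cap, rng):
    """Rolls of a fair k-sided die until all face counts are equal,
    truncated at cap rolls."""
    counts = [0] * k
    for n in range(1, cap + 1):
        counts[rng.randrange(k)] += 1
        if counts.count(counts[0]) == k:  # all counts equal
            return n
    return cap

rng = random.Random(42)
for cap in (10**2, 10**4, 10**6):
    mean = sum(stop_time(2, cap, rng) for _ in range(200)) / 200
    print(cap, round(mean, 1))
```

The truncated mean for $k=2$ grows roughly like $\sqrt{\text{cap}}$, consistent with $\mathbb E(X_2)=\infty$.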

As you say, for six or more sides there is a positive probability of never equalising. I suspect that six is not the smallest value for which this holds.