Expected number of rolls for an unfair die to get all possibile values at least once

1.4k Views Asked by At

Suppose that we have a 6-sided unfair dice, where rolling a 1 is twice as likely as rolling any other number, and the other numbers have the same likelihood. What is the expected number of rolls to get each value at least once?

Thus, $$p(1) = 2/7\qquad p(2) = p(3) = \cdots = p(6) = 1/7$$

I understand how to approach this when the probabilities are the same, as it's just the Coupon collector's problem, but throwing in a non-uniform probability distribution throws me off. I know there is a general solution for this problem, but I can't seem to get any intuition as to why it's the case.

4

There are 4 best solutions below

2
On BEST ANSWER

Consider the die as a $7$-sided die with two $1$s and wait for all sides to appear. With probability $\frac27$, you unnecessarily waited for the second $1$ for an expected number of $7$ rolls at the end. Correcting for that yields

$$7H_7-\frac27\cdot7=\frac{323}{20}\;.$$

0
On

For $k=0,1,2,3,4,5$ let $S_{k}$ denote the status where $1$ has not been rolled yet and exactly $k$ of the other numbers have been rolled.

For $k=0,1,2,3,4,5$ let $T_{k}$ denote the status where $1$ has been rolled and exactly $k$ of the other numbers have been rolled.

Let $\mu_{k}$ denote the expectation of rolls still to go starting in status $S_{k}$.

Let $\nu_{k}$ denote the expectation of rolls still to go starting in status $T_{k}$.

To be found is $\mu_0$ and we have the following relations:

$$\nu_{5}=0\tag1$$ $$\mu_{5}=1+\frac{2}{7}\nu_{5}+\frac{5}{7}\mu_{5}=1+\frac{5}{7}\mu_{5}\tag2$$

for $k=0,1,2,3,4$:

$$\mu_{k}=1+\frac{2}{7}\nu_{k}+\frac{k}{7}\mu_{k}+\frac{5-k}{7}\mu_{k+1}\tag3$$

$$\nu_{k}=1+\frac{k+2}{7}\nu_{k}+\frac{5-k}{7}\nu_{k+1}\tag4$$

Note that (2) leads to $\mu_5=3.5$. Substituting this together with $\nu_5=0$ in (3) and (4) gives you for $k=4$ two equalities that enable you to find $\nu_4$ and $\mu_4$. This proces can be repeated till you arrive at values for $\nu_0$ and (our final goal) $\mu_0$.

1
On

Say $T_k$ is the time it takes to get from the $(k-1)$-th to the $k$-th new roll, and $C_k$ is the value of the $k$-th new roll.

You can calculate $\mathbb{E}[T_k]$ by $\mathbb{E}[\mathbb{E}[T_k | C_1,...,C_{k-1}]]$ instead. For example,

$$\mathbb{E}[T_2] = \mathbb{E}[\mathbb{E}[T_2 | C_1]] = \sum_{j=1}^6 \mathbb{E}[T_2 | C_1 = j] \mathbb{P}(C_1 = j) = \frac{7}{5} \cdot \frac{2}{7} + 5 \cdot \frac{7}{6} \cdot \frac{1}{7} = \frac{37}{30}.$$

This is worked out in general on page $9$ and $10$ of this link.

0
On

We describe the scenario using an eleven state markov chain. (tedious, I know...if all six die results have different probabilities though, could go upwards of $2^6$ states though, so at least we could trim it down to only eleven...)

Here is a transition diagram.

unfair die couponcollector transition diagram

To give names to states, I will refer to them as $(yes,k)$ for if $1$ has been seen and $k$ not-1 numbers have been rolled, and $(no,k)$ for if $1$ has not been seen and $k$ not-1 numbers have been rolled.

The transition matrix with order of rows/columns as $(yes,5),(yes,0),(yes,1),\dots,(yes,4),(no,1),(no,2),\dots,(no,5)$

$$\begin{bmatrix} 1&0&0&0&0&1/7&0&0&0&0&2/7\\ 0&2/7&0&0&0&0&0&0&0&0&0\\ 0&5/7&3/7&0&0&0&2/7&0&0&0&0\\ 0&0&4/7&4/7&0&0&0&2/7&0&0&0\\ 0&0&0&3/7&5/7&0&0&0&2/7&0&0\\ 0&0&0&0&2/7&6/7&0&0&0&2/7&0\\ 0&0&0&0&0&0&1/7&0&0&0&0\\ 0&0&0&0&0&0&4/7&2/7&0&0&0\\ 0&0&0&0&0&0&0&3/7&3/7&0&0\\ 0&0&0&0&0&0&0&0&2/7&4/7&0\\ 0&0&0&0&0&0&0&0&0&1/7&5/7\end{bmatrix}$$

Following properties of absorbing markov chains, we look to the fundamental matrix $(I-R)^{-1}$

Using a calculator,

enter image description here

enter image description here

Now, since we start in state $(yes,0)$ with probability $2/7$ and start in state $(no,1)$ with probability $5/7$, multiplying our fundamental matrix by the corresponding vector, we get:

$(7/5+7/4+7/3+7/2+7)2/7 + (7/12+7/5+14/5+98/15+7/6+14/15+7/10+7/15+7/30)5/7 = \frac{303}{20}$

Since this calculation assumed however that we started in one of those two states however, the initial roll must also be considered, so our final expected time is found by adding one to it, giving $\frac{323}{20} = 16.15$ rolls on average.


As a side note, the method employed is the same as that of dr.hab, however with the use of technology and the language of matrices, we can conveniently write and compute the result.