Reference Needed: $m$-faced dice with $n$ options

248 Views Asked by At

I would like to investigate the following question and I am not sure if similar ideas have been investigated. I cannot really find a detail discussion about this matter on the Internet. It will be great if you can tell me what domain I should look into, or you may also give your view on the following questions.


Suppose we have one fair dice. We would like to choose an option randomly from 6 possible options. We roll the dice once then we can decide which option we should choose.


Suppose we now have 8 options. To choose an option randomly, different algorithms can be used. For example, roll the dice once first.

If the result is odd, we focus on option 1 to 4. Then roll the dice again, if we get 1 to 4, we know which option to use. If it is a 5 or 6, repeat the second step again.

If the result is even, we focus on option 5 to 8. Then roll the dice again, if we get $n$, which $1\leq n\leq 4$, we choose option $n+4$. If it is a 5 or 6, repeat the second step again.

One can check the the probability of choosing each option is exactly $\frac{1}{8}$.


My question now is to investigate if it is possible that for whatever number of options we have, we can always use a dice to choose an option randomly. And is there a general algorithm to follow in order to choose an option?

My second question is more advance. Suppose we have a fair $m$-faced dice and we have $n$ total options. Can we choose an option randomly by using the dice for any $m$ and $n$?


Once again, you don't have to answer the above two questions. I am looking for hints or domain that I should look into. Thank you.

4

There are 4 best solutions below

4
On

Given an $m$-faced die, you can simulate an $n$-faced die, for any values of $m,n>1$.

Suppose the faces of your $m$-die are $0,1,\dots,m-1$. Then write out $0,1,\dots,n-1$ in base $m$. So each face of your $n$-die is associated with a base-$m$ string of length $\ell=\lceil\log_m(n)\rceil$.

Then roll your $m$-die a total of $\ell$ times consecutively, and record each die roll. This gives you a base-$m$ number with $\ell$ digits. If this matches up with one of $0,1,\dots,n-1$, great. If not, repeat the process again.

It's clear that each of the values $0,\dots,n-1$ are equally likely to show up. Furthermore, you will almost surely require a finite number of rolls. In fact, in expectation you will need $m^\ell/n<m$ rounds.


As a concrete example, let's take $m=2$ and $n=7$. Roll our $2$-sided die thrice, the list of outcomes is $$000,001,010,011,100,101,110,111.$$The first $7$ of these correspond to the faces $0,\dots,6$ of the $7$-sided die. If we roll three $1$s, then we simply ignore the result and go again.

4
On

Here's another method based on similar ideas to those used in jlammy's nice answer.

Let $\ b_i=\frac{i}{n}\ $ for $\ i=0,1,\dots,n\ $, $\ D_j\ $ the face observed on the $\ j^\text{th}\ $ toss of your $\ m$-die (with faces labelled $\ 0,1,\dots m-1\ $) for $\ j=1,2,\dots\ $, and $\ B_t=\sum_\limits{j=1}^tD_jm^{-j}\ $. Keep tossing the die until there is an integer $\ k=0,1,\dots,n-1\ $ such that $\ b_k\le B_t\le b_{k+1}-m^{-t}\ $. With probability $1$ this will occur after only a finite number of tosses, and the integer $\ k\ $ is then equally likely to be any of $\ 0,1,\dots,n-1\ $.

The integer $\ k\ $ will not exist after $\ t\ $ tosses if and only if $\ D_1, D_2, \dots, D_t\ $ are the first $\ t\ $ digits in the base-$m$ expansion of some $\ b_i\ $ with $\ 0\le i\le n-1\ $. The probability of this being the case is $\ \frac{n_t}{m^t}\ $, where $\ n_t\ $ is the number of distinct such leading $\ t$-digit sequences among the base-$m$ expansions of the numbers $\ b_0,b_1, \dots, b_{n-1}\ $. The probability that $\ k\ $ will be determined on the $\ t^\text{th}\ $ toss is thus $\ \frac{n_{t-1}}{m^{t-1}}-\frac{n_t}{m^t}\ $ for $\ t>1\ $, or $\ 1-\frac{n_1}{m}\ $ for $\ t=1\ $, and the expected number of tosses it will take to determine $\ k\ $ is $$ \sum_{t=1}^\infty\frac{n_t}{m^t}\le n\sum_{t=1}^\infty m^{-t}=\frac{n}{m-1}\ . $$ One advantage of this method is that it generalises easily to the case where the $\ n\ $ options are not necessarily equally likely. If $\ p_i\ $ is the probability of option $\ i\ $, and $\ b_i\ $ is defined by $$ b_i=\cases{0&if $ i=0$\\ \sum_\limits{j=0}^{i-1}p_j&if $\ 1\le i\le n\ $,} $$ then the same method will generate the integer $\ k\ $ with probability $\ p_k\ $.

Example (as requested by OP)

If $\ n=7 $ and $\ m=3\ $, then $\ b_i=\frac{i}{7}\ $ for $\ i=0,1,\dots7\ $.

  1. First toss the die and calculate $\ B_1\ $. Suppose, for example that the first toss results in $\ D_1=1\ $. Then $\ B_1=\frac{D_1}{3}=\frac{1}{3}\ $.
  2. Next, check to see whether $\ B_1\ $ satisfies the condition $\ b_k\le B_1\le b_{k+1}-\frac{1}{3}\ $ for any integer $\ k\ $. If there is such an integer, then it will be unique, and you can stop with that as your randomly selected choice. For $\ B_1=\frac{1}{3}\ $ we have $\ \frac{2}{7}=b_2<B_1<b_3=\frac{3}{7}\ $, but $\ B_1\not\le b_3-\frac{1}{3}=\frac{2}{21}\ $, so there is no integer $\ k\ $ satisfying the stopping condition $\ b_k\le B_1\le b_{k+1}-\frac{1}{3}\ $, and we have to toss the die again. Suppose the result of the second toss is $\ D_2=0\ $. Then $\ B_2=\frac{1}{3}+\frac{0}{9}=\frac{1}{3}\ $.
  3. Now check to see whether $\ B_2\ $ satisfies the condition $\ b_k\le B_2\le b_{k+1}-\frac{1}{9}\ $ for any integer $\ k\ $. For $\ B_2=\frac{1}{3}\ $, as before, we have $\ \frac{2}{7}=b_2<B_2<b_3=\frac{3}{7}\ $, but $\ B_2\not\le b_3-\frac{1}{9}=\frac{19}{63}\ $, so we have to toss the die again. Suppose the result of the third toss is $\ D_3=1\ $. Then $\ B_3=\frac{1}{3}+\frac{0}{9}+\frac{1}{27}=\frac{10}{27}\ $.
  4. On now checking the stopping condition, we find that $\ b_2\le B_3\le b_3-\frac{1}{27}=\frac{74}{189}\ $, so it is satisfied by $\ k=2\ $ and this becomes our randomly chosen option.

The procedure becomes easier to understand and execute if we use ternary numerals to represent the numbers $\ b_i\ $: $$ \begin{align} b_1=\frac{1}{7}&=0.\overline{010212}_3\\ b_2=\frac{2}{7}&=0.\overline{021201}_3\\ b_3=\frac{3}{7}&=0.\overline{102120}_3\\ b_4=\frac{4}{7}&=0.\overline{120102}_3\\ b_5=\frac{5}{7}&=0.\overline{201021}_3\\ b_6=\frac{6}{7}&=0.\overline{212010}_3\ . \end{align} $$ As long as the results, $\ D_1,D_2,\dots,D_t\ $, of the first $\ t\ $ tosses of the die duplicate the first $\ t\ $ ternary digits of any of the numbers $\ b_i\ $ for $\ 1\le i\le 6\ $, the stopping condition will not be satisfied and we have to toss again. If, however, $\ D_1,D_2,\dots,D_t\ $ duplicate the first $\ t\ $ ternary digits of $\ b_k\ $ (say) but $\ D_1,D_2,\dots,D_{t+1}\ $ do not duplicate the first $\ t+1\ $ ternary digits of any of $\ b_{k-1}, b_k\ $ or $\ b_{k+1}\ $ then we must have either $\ b_{k-1}\le B_{t+1}\le b_k-3^{-(t+1)}\ $ (if $\ D_{t+1}\ $ is less than the $\ (t+1)^\text{th}\ $ ternary digit of $\ b_k\ $), in which case $\ k-1\ $ will be our randomly chosen option, or $\ b_k\le B_{t+1}\le b_{k+1}-3^{-(t+1)}\ $ (if $\ D_{t+1}\ $ is greater than the $\ (t+1)^\text{th}\ $ ternary digit of $\ b_k\ $), in which case $\ k\ $ will be our randomly chosen option.

Here are a few further examples to illustrate the procedure:

  1. Tosses $\ 0,0\ $ give $\ k=0\ $.
  2. Tosses $\ 1,2,0,2\ $ give $\ k=4\ $.
  3. Tosses $\ 1,2,0,0\ $ give $\ k=3\ $.
  4. Tosses $\ 2,0,1,0,1\ $ give $\ k=4\ $.
  5. Tosses $\ 2,1,2,1\ $ give $\ k=6\ $.

Why does it work?

  • First note that the sequence $\ \left\{B_t\right\}_{t=1}^T\ $ is non-decreasing, so if $\ b_k\le B_\tau\ $ for some $\ \tau<T\ $, then $\ b_k\le B_t\ $ for all $\ t\ge\tau\ $.
  • Next, if $\ B_\tau\le b_{k+1}-m^{-\tau}\ $, then $\ B_t<b_{k+1}\ $ for all $\ t\ge\tau\ $, because $\ \sum_\limits{s=\tau+1}^tD_sm^{-s}<m^{-\tau}\ $.
  • If $\ b_{k+1}-m^{-\tau}<B_\tau\le b_{k+1}\ $ then $\ D_1, D_2, \dots, D_\tau\ $ must be the first $\ \tau\ $ digits of the base-$m$ expansion of $\ b_{k+1}\ $. Thus, if you toss again, and the result, $\ D_{\tau+1}\ $, is greater than the $\ (\tau+1)^\text{th}\ $ digit of the base-$m$ expansion of $\ b_{k+1}\ $, then you will have $\ B_{\tau+1}>b_{k+1}\ $

The upshot of these observations is that if the stopping rule is satisfied at $\ t=\tau\ $, then you can be sure that $\ B_t\ $ will lie in the interval $\ \left[b_k, b_{k+1}\right)\ $ for all $\ t\ge\tau\ $, no matter how large you make $\ t\ $. If it is not satisfied, however, then even if $\ b_k \le $$B_\tau<$$b_{k+1}\ $, it is still possible that if you continue tossing you will get $\ B_t\ge b_{k+1}\ $ for some $\ t>\tau\ $.

Now if $\ \tau $ is the time determined by the stopping rule, and $\ c=$$\sum_\limits{i=1}^\infty c_im^{-i}\ $, with $\ c_i\in\{0,1,\dots,m-1\}\ $ for all $\ i\ $, is any of the numbers $\ b_0,b_1,\dots,b_n\ $, then $$ \begin{align} \left\{B_\tau<c\right\}&=\left\{D_1<c_1\right\}\cup\\ &\hspace{2em}\bigcup_\limits{i=2}^\infty\left\{D_i<c_i, D_{i-1}=c_{i-1},\dots,D_1=c_1-1\right\}\ . \end{align} $$ The events $\ \left\{D_i<c_i, D_{i-1}=c_{i-1},\dots,D_1=c_1-1\right\}\ $ and $\ \left\{D_1<c_1\right\}\ $ are mutually exclusive, and $$ \begin{align} P\big(D_1<c_1\big)&=\frac{c_1}{m}\\ P\big(D_i<c_i, D_{i-1}=c_{i-1},\dots,D_1=c_1-1\big)&=\left(\frac{1}{m^{i-1}}\right)\left(\frac{c_i}{m}\right)\\ &=\frac{c_i}{m^i}\ \text{ for }\ i\ge2\ . \end{align} $$ Thus $$ P\big(B_\tau<c\big)=\sum_{i=1}^\infty\frac{c_i}{m^i}=c\ , $$ and therefore $$ \begin{align} P\big(b_k\le B_\tau<b_{k+1}\big)&=b_{k+1}-b_k\\ &=p_k\ , \end{align} $$ if $\ b_k\ $ are defined as above, because $\ \big\{B_\tau<b_k\big\}\subseteq$$\big\{B_\tau<b_{k+1}\big\}\ $.

4
On

We are given a die of $m$ faces, and $n$ options to choose from. Like in jlammy's answer, let $\ell$ be the number of die rolls. There are then $m^\ell$ possible outcomes, so we certainly need $m^\ell \geq n$ so that every option has a chance of being chosen.

This answer will tweak jlammy's answer, to be more efficient. For example, suppose $m=6$ and $n=7$. If we use $\ell= 2$, there will be $6^2 = 36$ outcomes. In jlammy's answer, only $7$ of these are used, one for each of the $n=7$ options. So there is a $\frac{29}{36}$ chance of needing to re-roll. Instead, we'd like to assign $5$ outcomes to each of the $7$ options. That way we use $5\cdot 7 = 35$ of the $36$ outcomes, so there is only a $\frac{1}{36}$ chance of needing to re-roll.

To do this, divide $m^\ell$ by $n$, using integer division, so we get a quotient $q$ and a remainder $r$. Thus $m^\ell = qn+r$, where $0\leq r < n$. Also, remember that we choose $\ell$ such that $m^\ell \geq n$, which implies $q\geq 1$ (for, if $q\leq 0$ then $m^\ell = qn+r \leq r < n$). In particular, we have $0\leq r < qn$.

The idea is that each of the $n$ options will be assigned $q$ separate outcomes, with $r$ outcomes left over so the probability of a re-roll will be $\frac{r}{m^\ell}$. To assign the outcomes, write the die faces as $0, 1, \ldots, m-1$, and the options as $0, 1, \ldots, n-1$. Now the sequence of $\ell$ die rolls corresponds to an integer written in base-$m$. Call this number $Z$, so $0\leq Z \leq m^\ell-1$.

If $Z\geq qn$, then re-roll. Otherwise, assume $0\leq Z\leq qn-1$. Then divide $Z$ by $q$, and record the quotient (discard the remainder). That new quotient is the option we select.

Let $p = \frac{r}{m^\ell}$ be the probability of re-rolling, so $0\leq p < 1$. The expected number of die rolls is $$\begin{eqnarray*} E & =& (1-p)(\ell) + p(1-p)(2\ell) + p^2(1-p)(3\ell) + \ldots\\ & =& \ell (1-p) \left( \sum_{k=1}^\infty k p^{k-1} \right)\\ & =& \ell (1-p) \frac{d}{dp} \sum_{k=1}^\infty p^k\\ & =& \ell (1-p) \frac{d}{dp} \left(\frac{1}{1-p} - 1 \right) \\ & =& \frac{\ell}{1-p}, \end{eqnarray*} $$ since $0\leq p < 1$. Indeed, $$ E = \frac{\ell}{1-\frac{r}{m^\ell}} = \ell \left(\frac{ m^\ell}{m^\ell - r} \right) = \ell \left( \frac{qn+r}{qn}\right) = \ell\left(1 + \frac{r}{qn}\right). $$ Since we know $r< qn$, we get $E< 2\ell$. But there are $\ell$ die rolls per round of rolling, so the expected number of rounds is less than $2$. (Compare with the bound of $m$ rounds in jlammy's answer.)

Also, notice that, in terms of minimizing expected number of die rolls, choosing $\ell = \left\lceil \log_m (n)\right\rceil$ may not be optimal. Above, we have only assumed that $m^\ell \geq n$, so $\ell \geq \left\lceil \log_m(n)\right\rceil$. We are allowed to choose larger $\ell$ if we want.

For example, suppose $m=6$ and $n=19$. Then choosing $\ell=2$ gives us $6^2 = 36 = 1\cdot 19 + 17$ outcomes, with a $\frac{17}{36}$ chance of a re-roll. The expected number of rolls is $$ 2 \left( \frac{36}{19} \right) = \frac{72}{19}\approx 3.789. $$ But with $\ell=3$, we have $m^\ell=216 =11\cdot 19 + 7$. So we have expectation $$ 3\left(\frac{216}{209} \right) = \frac{648}{209} \approx 3.100. $$ In fact $\ell=3$ is optimal for this example. We must choose $\ell\geq 2$, and notice that we always have $E\geq \ell$. Thus since we have already found an expectation less than $4$, there is no need to try $\ell$-values that are $4$ or more. So the only possibilities (for an optimal $\ell$-value) are $\ell=2$ and $\ell=3$, and we have seen that $\ell=3$ is better.

0
On

A further tweak to mathmandan's scheme is as follows.

For fixed $m,n,\ell$, allocate sequences of $\ell$ numbers from $1,\ldots,m$ to outcomes as before, i.e. choose the largest $k$ such that $kn\leq m^{\ell}$ and allocate $k$ of the smallest (lexicographically) $kn$ sequences to each of $1,\ldots,n$ and the others to "reroll". However, instead of rolling a whole batch $\ell$ and then rerolling if necessary, roll up to $\ell$ dice one by one and stop as soon as you know you will need to reroll. As well as improving the expectation, this can actually change which choice of $\ell$ is best.

For example, with $m=6,n=20$, mathmandan's approach gets $3.6$ with $\ell=2$ and $3.24$ with $\ell=3$, so prefers the latter. However, with $\ell=2$ the reroll combinations are $(4,3),(4,4),\ldots,(6,6)$, so if the first die of the first pair is 5 or 6, you don't need to roll the other one. The expected number of rolls with this modification, $E$, satisfies $$E=\frac{1}{3}(1+E)+\frac{4}{36}(2+E)+\frac{20}{36}\times2,$$ which gives $E=3$.

(This tweak will also improve the expectation for $\ell=3$ slightly, since you can stop and reroll if your first two rolls are $6,5$ or $6,6$, but it will clearly be worse than $E=3$ for $\ell=2$.)