Generalization of a common die-rolling game; looking for a limit


A common interview question (which appears at least once on this board) is the following:

You are playing a game in which a single fair 6-sided die is rolled repeatedly, up to a maximum of N rolls. You can stop the process any time you wish, at which point your score is the last value rolled. What is your expected score as a function of N, assuming optimal play?

The answer is found recursively: For N=1 the answer is 3.5. For larger N the answer for N-1 determines whether to stop after the first roll, from which we can calculate the answer for N.
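As a quick illustration of that recursion (a sketch of my own, not part of the original question; `expected_score` is a made-up name):

```python
# Sketch of the recursion described above: E(1) = 3.5, and E(N) is obtained
# from E(N-1) by keeping a roll v exactly when v beats the value E(N-1) of
# continuing with the remaining N-1 rolls.

def expected_score(n: int, k: int = 6) -> float:
    """Expected score with up to n rolls of a fair k-sided die, optimal play."""
    e = (k + 1) / 2                      # E(1): last roll, keep whatever comes
    for _ in range(n - 1):
        # Stop on v if v > e; otherwise continue with expected value e.
        e = sum(max(v, e) for v in range(1, k + 1)) / k
    return e

print(expected_score(1))   # 3.5
print(expected_score(2))   # 4.25
```

For N=2 this reproduces the familiar value: keep a first roll of 4, 5, or 6, otherwise reroll, giving (3·3.5 + 4 + 5 + 6)/6 = 4.25.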

Now consider just that first roll. Obviously if the first roll is a 6, you stop. Suppose the first roll is 5. Do you stop? The answer depends on N. Clearly for N=2 you stop if 5 is the first roll, and just as clearly you do not stop on an initial 5 if N is very large, because you're so likely to get a 6 eventually. How large must N be so that you will stop on the first roll only if that roll is 6? The answer is 6. That is, if the first roll is 5 you should stop if N<=5 and continue if N>=6.

Now generalize to a k-sided die: Let g(k) be the smallest N such that the optimal strategy for a k-sided die is to stop after the first roll only when that roll is k. We've seen that g(6)=6. Is this a coincidence? In fact it is: g(k)=k up to k=9, but g(10) = 11.
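These values are easy to check numerically. A sketch of my own (the name `g` mirrors the text): since you stop on a first roll v exactly when v exceeds the expected value E(N-1) of continuing, g(k) is the smallest N with E(N-1) > k-1.

```python
# Sketch: g(k) = smallest N such that E(N-1) > k-1, i.e. the first N at
# which even a first roll of k-1 is worth rejecting.

def g(k: int) -> int:
    e = (k + 1) / 2          # E(1)
    n = 2                    # candidate N; e currently holds E(n-1)
    while e <= k - 1:
        e = sum(max(v, e) for v in range(1, k + 1)) / k
        n += 1
    return n

print(g(6), g(9), g(10))   # 6 9 11
```

This confirms g(k)=k through k=9 and the jump at g(10)=11.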

So consider the ratio g(k)/k. As k increases, this ratio increases (though not monotonically). Numeric calculation suggests that it approaches a limit, and that limit might be e/2. Can we prove this limit, or some other limit, or that the limit doesn't exist?



Yes, there is a limit. I don't have a final, double-checked answer, but my calculations suggest that it is nothing nearly so simple as $e/2$.

So I'll just outline the argument and let you fill it in.

Let $g(k, m)$ be the smallest number of remaining rolls at which you'll only take the top $m$ values on a $k$-sided die if you see them. This is a generalization of your $g$ in that it has two parameters, but they are connected by $g(k) = g(k, 1)$.

The first easy result is that $g(k, m)$ is bounded above by $O(k \log(k) / (m - 1))$. To see this, it suffices to note that the optimal strategy gets better results than the suboptimal strategy of accepting the first roll that lands in the top $m-1$ values, or scoring 0 if none does. But in any $k/(m - 1)$ rolls, that suboptimal strategy has probability $>0.5$ of getting a value in the top $m-1$. If we repeat that $\log_2(k)$ times, our probability of failing to get a value in the top $m-1$ is $< \frac{1}{k}$, and our expected result is therefore good enough that we would be better off not accepting the $(m+1)$-th best value but instead doing this suboptimal thing.

What does this do for us? It lets us break out a telescoping series like so, then bound the last term: $$ g(k, 1) = (g(k, 1) - g(k, 2)) + (g(k, 2) - g(k, 3)) + \ldots + (g(k, m-1) - g(k,m)) + g(k, m)$$

My hasty calculation that I don't wish to type up is that $g(k, i) - g(k, i+1) = k \log(1 + 2/i) / (i+1) + O(1)$. I probably made a calculation error. But the principle of the calculation is that at $g(k, i+1)$ the expected value of continuing just reached or passed $k - i - 1$ which is why we now only stop at the $i+1$ values $k, k-1, \ldots k-i$. How many rolls on this strategy does it take so that the expected value passes $k-i$?

Note that the terms of this series are $O(1/i^2)$, which is a converging series. It converges to something. (If my calculation is right, approximately $2.118347...$) So we take the above telescoping series, expand out the first $\sqrt{k \log(k)}$ terms, add/subtract the tail, and divide by $k$; the error terms vanish as $k \to \infty$, so the ratio is a sequence that converges to the infinite sum $C$.


Not an answer but a reformulation:

For some fixed integer $k>1$, let $b_1=\frac{k+1}{2}$, and for $t=2,3, \cdots $ define the recursion:

$$a_t = \lceil b_{t-1}\rceil$$ $$c_t = 1-\frac{a_t}{k}$$ $$b_t = \left(c_t+\frac1k \right)\,\frac{a_t+k}{2}+\left(1-c_t -\frac1k\right)\,b_{t-1} \tag{1}$$

Let $g(k)$ be the minimum $t$ such that $c_t=0$ (equivalently, $b_{t-1}> k-1$). We are interested in $$\lim_{k\to \infty} \frac{g(k)}{k}$$
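The recursion is easy to run numerically. Here is a minimal sketch of my own (`g_rec` is a made-up name): $b_t$ is the expected score with $t$ rolls left, $a_t$ the smallest value worth keeping, and $c_t + \frac1k$ the probability of stopping.

```python
# Direct implementation of recursion (1): iterate b_t until c_t = 0,
# i.e. until only a roll of k is worth keeping.
import math

def g_rec(k: int) -> int:
    b = (k + 1) / 2                      # b_1
    t = 1
    while True:
        t += 1
        a = math.ceil(b)                 # a_t
        if a >= k:                       # c_t = 0: stop only on a roll of k
            return t
        c = 1 - a / k                    # c_t
        b = (c + 1 / k) * (a + k) / 2 + (1 - c - 1 / k) * b

print(g_rec(6), g_rec(10))   # 6 11
```

This reproduces the values from the question, e.g. $g(6)=6$ and $g(10)=11$.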


Update: here's an approximation.

Let us assume the fractional part of $b_t$ is uniformly distributed, and hence approximate $a_t \approx b_{t-1}+\frac12$.

Substituting into $(1)$ we get

$$b_t =b_{t-1}+\frac{1}{2k}\left(k+\frac12 -b_{t-1}\right)^2 \tag{2}$$

Letting $\alpha=\frac{1}{2k}$ and $\beta = k+\frac12$, treating $b(t)$ as a continuous function, and replacing the difference by a derivative, we get the differential equation

$$b'(t)=\alpha(\beta-b(t))^2$$

which gives

$$b(t)=\beta-\frac{1}{\alpha(t+\gamma)} \tag{3}$$

for some constant $\gamma$. We expect this approximation to work for $t$ neither too small nor too big (recall that $b(t)$ grows towards $k$, so our mean approximation $\lceil b\rceil \approx b+\frac12$ will break down some time after $b(t)=k-1$, but fortunately that's our range of interest).
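For completeness, $(3)$ follows from the ODE by separation of variables:

$$\int \frac{db}{(\beta-b)^2}=\int \alpha\,dt \implies \frac{1}{\beta-b(t)}=\alpha(t+\gamma) \implies b(t)=\beta-\frac{1}{\alpha(t+\gamma)}$$

where $\gamma$ absorbs the constant of integration.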

Now, using $\beta-(k-1)=\frac32$, $$b(t^*)=k-1 \implies t^*+\gamma = \frac{2}{3\alpha} = \frac43 k \implies t^* =\frac43 k -\gamma \tag{4}$$

What remains is to check that $\gamma$ does not depend (or depends only weakly) on $k$. A crude way is to impose $b(1)=\frac{k+1}{2}$ in $(3)$, which gives $\gamma=3$.

According to this (not well justified) approximation, the ratio $g(k)/k$ should tend to $4/3=1.3333\cdots$. Numerical computations suggest a slightly higher value:

   k    g(k)   g(k)/k
   6       6   1.0000
  15      17   1.1333
  50      63   1.2600
 101     131   1.2970
 200     265   1.3250
 399     533   1.3358
 800    1075   1.3438
 999    1344   1.3453