Optimal strategy to maximize cumulative sum of dice rolls but the sum cannot be a square number

232 Views Asked by At

The rule of the game is as follows: The player rolls a fair six-sided dice repetitively until the game is over. After each roll, if the cumulative sum of all the rolls so far is a square number (1, 4, 9...), then the game is over, and the player gets nothing. If the cumulative sum of all the rolls is NOT a square number, the player can choose to either stop rolling and get paid the current cumulative sum amount of dollars; or she can choose to continue the game and roll again.

For example, let's say Alice plays this game and rolls a "2". It's not a square number. Alice chooses to continue the game. She rolls a "6" this time. Now the cumulative sum is 8. She can choose to keep rolling, or she can end the game and keep 8$ as her winning. Let's say she chooses to roll again and rolls a "1", then the cumulative sum becomes 9, a square number. The game ends and she gets no money.

My question is: what is the optimal strategy for this game, and what is the expectation of winnings if you play the optimal strategy? Or in other words, if you charge a ticket price to play this game, it should be at least how much?

My thoughts:
let $E_k$ denote your expected payoff if the current cumulative sum is $k$. $E_k=0 $ if $k=n^2$ where $n=1,2,...$

The ticket price would be $E_0=\frac{1}{6}(E_1+E_2+E_3+E_4+E_5+E_6)=\frac{1}{6}(E_2+E_3+E_5+E_6)$

And we have $E_2=max(2,\frac{1}{6}\sum^{i=6}_{i=1}E_{2+i}), E_3=max(3,\frac{1}{6}\sum^{i=6}_{i=1}E_{3+i}),...$ etc.

To me this looks like a dynamic programming problem, but I cannot find a termination point. At what cumulative sum should I stop rolling?

Let's say $N$ is a large integer, then for any cumulative sum $s$ where $(N-1)^2<s<N^2-6$, we would absolutely choose to roll again. But if we get cumulative sum $N^2-6\leq s<N^2$, do we also choose to roll gain?

For example, $s=N^2-1$, how do I know if $E_s=max(s,\frac{1}{6}\sum^{i=6}_{i=1}E_{s+i})=s>\frac{1}{6}\sum^{i=6}_{i=1}E_{s+i}?$

1

There are 1 best solutions below

2
On

Numerically, you can iteratively compute $$G[n]:=\begin{cases} 0 & \text{if } n = m^2\\ \max(n, \frac{1}{6}(G[n+1] + \cdots G[n+6]) & \text{elsewhere} \end{cases} $$

with the initial values $G[n]=0$ until convergence.

The stop numbers are $30,31, 43, 44, 45, 58, 59, 60, 61, 62$ and the final range: $75 \cdots 80$

The global expected gain is $7.1754993$

N   EG
1   0
2   10.05869783
3   8.621741003
4   0
5   11.79738564
6   12.57517138
7   13.31440987
8   14.04347910
9   0
10  14.77969208
11  16.07156142
12  17.24188584
13  17.74984080
14  18.41789449
15  19.19696993
16  0
17  23.82277745
18  24.26383236
19  20.79757059
20  22.42621663
21  23.87142254
22  25.18243387
23  26.39518873
24  26.91016177
25  0
26  32.19809286
27  32.54265800
28  33.04850185
29  33.67171793
30  30
31  31
32  32.92567941
33  34.61004882
34  36.08356496
35  37.41101439
36  0
37  44.78047216
38  44.66897614
39  44.71626526
40  44.92466180
41  45.37571097
42  45.99721882
43  43
44  44
45  45
46  46.17504099
47  48.08200603
48  49.72626593
49  0
50  59.75126693
51  59.87961151
52  59.61109558
53  59.52379621
54  59.59182532
55  59.79299313
56  60.10827983
57  60.64967899
58  58
59  59
60  60
61  61
62  62
63  63.89807398
64  0
65  76.63292805
66  76.65163674
67  76.71330768
68  76.82569230
69  76.56487911
70  76.48418209
71  76.55787037
72  76.76388888
73  77.08333333
74  77.5
75  75
76  76
77  77
78  78
79  79
80  80