The rule of the game is as follows: The player rolls a fair six-sided dice repetitively until the game is over. After each roll, if the cumulative sum of all the rolls so far is a square number (1, 4, 9...), then the game is over, and the player gets nothing. If the cumulative sum of all the rolls is NOT a square number, the player can choose to either stop rolling and get paid the current cumulative sum amount of dollars; or she can choose to continue the game and roll again.
For example, let's say Alice plays this game and rolls a "2". It's not a square number. Alice chooses to continue the game. She rolls a "6" this time. Now the cumulative sum is 8. She can choose to keep rolling, or she can end the game and keep 8$ as her winning. Let's say she chooses to roll again and rolls a "1", then the cumulative sum becomes 9, a square number. The game ends and she gets no money.
My question is: what is the optimal strategy for this game, and what is the expectation of winnings if you play the optimal strategy? Or in other words, if you charge a ticket price to play this game, it should be at least how much?
My thoughts:
let $E_k$ denote your expected payoff if the current cumulative sum is $k$. $E_k=0 $ if $k=n^2$ where $n=1,2,...$
The ticket price would be $E_0=\frac{1}{6}(E_1+E_2+E_3+E_4+E_5+E_6)=\frac{1}{6}(E_2+E_3+E_5+E_6)$
And we have $E_2=max(2,\frac{1}{6}\sum^{i=6}_{i=1}E_{2+i}), E_3=max(3,\frac{1}{6}\sum^{i=6}_{i=1}E_{3+i}),...$ etc.
To me this looks like a dynamic programming problem, but I cannot find a termination point. At what cumulative sum should I stop rolling?
Let's say $N$ is a large integer, then for any cumulative sum $s$ where $(N-1)^2<s<N^2-6$, we would absolutely choose to roll again. But if we get cumulative sum $N^2-6\leq s<N^2$, do we also choose to roll gain?
For example, $s=N^2-1$, how do I know if $E_s=max(s,\frac{1}{6}\sum^{i=6}_{i=1}E_{s+i})=s>\frac{1}{6}\sum^{i=6}_{i=1}E_{s+i}?$
Numerically, you can iteratively compute $$G[n]:=\begin{cases} 0 & \text{if } n = m^2\\ \max(n, \frac{1}{6}(G[n+1] + \cdots G[n+6]) & \text{elsewhere} \end{cases} $$
with the initial values $G[n]=0$ until convergence.
The stop numbers are $30,31, 43, 44, 45, 58, 59, 60, 61, 62$ and the final range: $75 \cdots 80$
The global expected gain is $7.1754993$