QuantGuide Busted 6 II

This question is from QuantGuide (Busted 6 II):
Suppose you play a game where you continually roll a die until you obtain either a 5 or a 6. If you receive a 5, then you cash out the sum of all of your previous rolls (excluding the 5). If you receive a 6, then you receive no payout. You have the decision to cash out mid-game. What is your expected payout following the optimal strategy?
My Approach:
First I look at the case where we can't cash out mid-game; the expected value is 2.5 in that case. With the additional option of stopping mid-game, I calculate the expected value at each throw of the die. For the $i^{th}$ throw the expected value will be: \begin{equation} \left(\frac{2}{3}\right)^i(2.5i)+\left(\frac{2}{3}\right)^{i-1}(2.5(i-1))\frac{1}{6} \end{equation} The 2.5 comes from the average value of each non-terminal roll, $\frac{1+2+3+4}{4}$. The first term covers the case where none of the throws so far is a 5 or 6, and the second term covers the case of landing a 5 on the $i^{th}$ throw. The value I am getting is approximately 2.59.


The case without the option to cash out was solved at "Roll until 5 or 6 is obtained on die without mid-game cash out".

There is 1 best solution below

The optimal strategy clearly takes the form of playing until you’ve obtained some threshold value and then cashing out, so we need to determine the threshold. At the last value under the threshold, you know you’re going to cash out if you don’t roll a $5$ or $6$. So the threshold is determined by the condition that a single roll will decrease the expected payout.

If your current sum is $s$, the expected payout if you cash out after the next roll is

$$ \frac16\cdot0+\frac16\cdot s+\frac46\cdot\left(s+\frac{1+4}2\right)\;. $$

This simplifies to $\frac{5s+10}6$, which is equal to $s$ for $s=10$. So if you have a sum of $10$, it doesn’t matter whether you roll once more or not; if you have less, you should continue, and if you have more, you should cash out.
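The one-roll comparison above can be checked with exact arithmetic. This is a minimal sketch, not part of the original answer; the function name `roll_once_then_cash_out` is an illustrative choice.

```python
from fractions import Fraction

def roll_once_then_cash_out(s):
    """Expected payout from sum s if you roll exactly once more, then cash out."""
    s = Fraction(s)
    six = Fraction(1, 6) * 0      # a 6 forfeits everything
    five = Fraction(1, 6) * s     # a 5 pays out the current sum
    # a roll of 1..4 is added to the sum, which is then cashed out
    other = sum(Fraction(1, 6) * (s + k) for k in range(1, 5))
    return six + five + other

# The gain from one more roll is (10 - s)/6: positive below 10, zero at 10, negative above.
for s in range(8, 13):
    print(s, roll_once_then_cash_out(s) - s)
```

The sign of the printed difference changes exactly at $s=10$, which matches the threshold derived above.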

The additional payout $a_s$ you expect to gain when you have a sum of $s$ is

\begin{eqnarray*} a_s &=& \frac16\cdot(-s)+\frac16\cdot0+\frac16\sum_{k=1}^4(k+a_{s+k}) \\ &=& \frac{10-s+\sum_{k=1}^4a_{s+k}}6\;, \end{eqnarray*}

with the “final values” $a_s=0$ for $s\ge10$. A linear ansatz yields the particular solution $a_s=\frac{5-s}2$, and writing $a_s=b_s+\frac{5-s}2$ leads to the homogeneous recurrence

$$ b_s=\frac16\sum_{k=1}^4b_{s+k}\;, $$

now with the “final values” $b_{10}=\frac52$, $b_{11}=3$, $b_{12}=\frac72$, $b_{13}=4$. As this is a fourth-order linear recurrence, it can be solved analytically, but the solution isn’t very helpful. Here’s a table with the results of working backwards from the threshold:

\begin{array}{c|c} s&b_s\\\hline 13&4\\ 12&\frac72\\ 11&3\\ 10&\frac52\\ 9&\frac{13}{6}\\ 8&\frac{67}{36}\\ 7&\frac{343}{216}\\ 6&\frac{1753}{1296}\\ 5&\frac{9031}{7776}\\ 4&\frac{46369}{46656}\\ 3&\frac{237751}{279936}\\ 2&\frac{1219729}{1679616}\\ 1&\frac{6266215}{10077696}\\ 0&\frac{32159329}{60466176}\\ \end{array}
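The table can be reproduced by working backwards with exact rational arithmetic. The following is an independent sketch, not the answerer's own program; `solve_b` and `THRESHOLD` are names assumed for illustration.

```python
from fractions import Fraction

THRESHOLD = 10  # cash out once the sum reaches this value

def solve_b(threshold=THRESHOLD):
    """Return {s: b_s} for 0 <= s <= threshold + 3, computed backwards."""
    b = {}
    # Final values: a_s = 0 for s >= threshold, so b_s = a_s - (5 - s)/2 = (s - 5)/2.
    for s in range(threshold, threshold + 4):
        b[s] = Fraction(s - 5, 2)
    # Homogeneous recurrence: b_s = (b_{s+1} + b_{s+2} + b_{s+3} + b_{s+4}) / 6.
    for s in range(threshold - 1, -1, -1):
        b[s] = sum(b[s + k] for k in range(1, 5)) / 6
    return b

b = solve_b()
for s in sorted(b, reverse=True):
    print(s, b[s])
```

Adding $\frac52$ to `b[0]` converts back to $a_0$. Calling `solve_b` with a different `threshold` gives the value of that threshold strategy, which is one way to compare candidate thresholds.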

Thus, the value you gain from the option to cash out is

$$b_0=\frac{32159329}{60466176}\approx0.5318565\;,$$

and the total value of the game with the option is

$$ a_0=\frac52+b_0=\frac{183324769}{60466176}\approx3.0318565\;. $$

Here’s the code I used to solve the recurrence, and here’s the code I used to simulate the experiment to check the results. There you can also change the threshold to check that $10$ is indeed optimal.
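For readers without access to the linked programs, a minimal independent simulation sketch (not the answerer's code) could look like this. The names `play` and `estimate`, the trial count, and the seed are all assumptions made for illustration.

```python
import random

def play(threshold, rng):
    """Play one game, cashing out as soon as the sum reaches `threshold`."""
    total = 0
    while total < threshold:
        roll = rng.randint(1, 6)
        if roll == 6:
            return 0        # bust: no payout
        if roll == 5:
            return total    # forced cash-out of the previous rolls
        total += roll
    return total            # voluntary cash-out at or above the threshold

def estimate(threshold, trials=200_000, seed=0):
    """Monte Carlo estimate of the expected payout for a threshold strategy."""
    rng = random.Random(seed)
    return sum(play(threshold, rng) for _ in range(trials)) / trials

for t in (8, 10, 12):
    print(t, round(estimate(t), 3))
```

With enough trials, the estimate for a threshold of $10$ should land close to $3.03$. Thresholds $10$ and $11$ have the same expected payout because rolling at a sum of exactly $10$ is break-even, so a simulation cannot separate those two.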