You are given a 100-sided die. After you roll once, you can choose to either get paid the dollar amount of that roll OR pay one dollar for one more roll. What is the expected value of the game? (There is no limit on the number of rolls.) Assume that the player always play optimally.
I know this has been asked before. But I am really not satisfied with the solutions.
Here's my approach and the reason why I am not quite satisfied.
(I am assuming that at each step the threshold may be different. Threshold is $k$, so that if a an outcome$\leq k$, I would reroll, else, I would not.)
Let $x_1$ be the expected value of the game before step $1$.
Let $x_2$ be the expected value of the game after $1$ roll (and after paying the 1 dollar).
And so on. So we basically need the max value of $x_1$. So for that, the threshold should be equal to $x_{n+1}-1$ for calculating $x_n$. ($-1$ due to the dollar deducted)
Now $$x_1=\frac{\lfloor x_2-1 \rfloor(x_2-1)}{100}+\frac{(100-\lfloor x_2-1 \rfloor)}{100}.\frac{(101+\lfloor x_2-1 \rfloor)}{2}$$.
Let there be a finite number of trails allowed say $N$. So we can recursively calculate $x_1$ in terms of $x_N$. Now for this problem, we want $x_1$ as $N\to\infty$. Simulating with python, the answer comes to be $\approx 87.35713841271844$, which is really close to that given by the approaches I was dissatisfied by.
But in those approaches, the threshold values at each step was taken to be the same.
They assumed that after paying $1$ dollar in case of a re-roll, we would get a similar problem, hence same expected value. But my doubts arose here, since after paying one dollar, how do we know that we would be using the same threshold again? That is, how do know that the expected value $x_1$, $x_2$, ... would be equal?
Also one more interesting thing to note is that once a player rolls $100$ times, then there is no point asking for a roll again. Since he would have paid $99$ dollars till that point. And the max he can get is $100$ dollars. So should $N$ really tend to $\infty$. But this is where the most interesting part comes. When I simulate this using python, for N=100, I get:
x100: 50.5
x99: 62.504999999999995
x98: 69.10804999999999
x97: 73.353474
x96: 76.31450128
x95: 78.48587596
x94: 80.1341244892
x93: 81.415958346468
.
.
.
x6: 87.35713340950431
x5: 87.3571347321737
x4: 87.35713586966939
x3: 87.35713684791568
x2: 87.35713768920749
x1: 87.35713841271844
In fact, for $N>217$, these values seem to be equal $\approx 87.35...$. So what does that mean? Keeping in mind the fact that one would not want to roll once he reaches $99$ re rolls.