Favourable modification of "Double or Nothing"

947 Views Asked by At

I am working through 'Great expectations: the theory of optimal stopping' by Y.S. Chow, H. Robbins, D. Siegmund and cannot fill in the gap in reasoning regarding the existence of optimal stopping rule. Let me describe the problem:

Favourable modification of "Double or Nothing".
A fair coin is tossed repeatedly. After each toss we have the option of stopping or going on to the next toss, our decision at each stage being allowed to depend on the outcomes thus far. We must stop after some finite (but not preassigned) number of tosses, and it is agreed that if we stop after the $n$th toss we are to receive a reward $X_n$. In "Double or Nothing", for the player starting with $1$ unit, the reward would be $\prod_{i=1}^n (Y_i+1)$ where $Y_i=1$ denotes heads on the $i$th toss and $Y_i=-1$ tails.
We, however, consider the case in which we are encouraged to continue playing by multiplying the above reward by the increasing sequence $2n/(n+1)$, yielding the reward sequence $$X_n=\frac{2n}{n+1}\prod_{i=1}^n (Y_i+1) \quad (n=1,2,\ldots),$$ which is $0$ if there are any tails in the first $n$ tosses and $2^{n+1}n/(n+1)$ otherwise.

Let me define $V=\sup_\tau\{\mathbb{E}X_\tau\}$ (supremum over the class of all possible stopping rules $\tau$ for which $\mathbb{E}X_\tau$ exists). We shall call $V$ the value of the sequence $\{X_n\}$. We are interested in finding $V$ and an optimal rule (i.e. stopping rule $\tau^*$ such that $V=\mathbb{E}X_{\tau^*}$) if it exists.

After defining the problem authors proceed to solve the problem:

(?) A little reflection will show that we need to consider only the class of stopping rules $\{\tau_k\}, k=1,2\ldots,$ where $\tau_k=k$; i.e. $\tau_k$ stops after the $k$th toss no matter what sequence of heads and tails has appeared. (?)
Clearly, $$\mathbb{E}X_{\tau_k}=\frac{1}{2^k}\cdot\frac{k2^{k+1}}{k+1}+\left(1-\frac{1}{2^k}\right)\cdot0=\frac{2k}{k+1},$$ and therefore $V=2$ but no optimal stopping rule exists.

Here is my problem: I have literally no reflection why is it the case that one can restrict only to the class of constant stopping rules. Precisely, why does the inequality $$\sup\{\mathbb{E}X_\tau\mid\tau=k\}\ge\sup\mathbb{E}X_\tau$$ hold? (The converse is clearly true.)

Thanks in advance for your attention!

1

There are 1 best solutions below

4
On

The reason is that we cannot possibly gain any useful information by playing. Whenever we get tails, it doesn't matter what we do from then on since the payoff is known to be $0$. So at each time step the only choice that matters is the one in the case where we've had only heads up to that point, and we can decide beforehand what to do if that one case occurs.