Optimal stopping criterion for reward proportional to average roll of an $m$-sided die


Consider an $m$-sided die. It can be rolled as many times as we wish. When we stop, we receive a reward equal to the average of all rolls made up to that point. I would like to find the optimal threshold (we stop rolling as soon as the current average exceeds it) that maximizes the expected reward.

From simulation, I get the following results for a $6$-sided die.

threshold (θ)           Expected return (x)
4.1                     4.389 < x < 4.39
4                       4.396 < x < 4.397
3.9                     4.391 < x < 4.392
3.8                     4.3976 < x < 4.398
3.7                     4.397 < x < 4.3978

It seems that the optimal threshold lies in $(3.7,3.9)$ for a $6$-sided die.

My first approach was to find the probability that the average of $n$ rolls of an $m$-sided die reaches the threshold $\theta$. Let's denote it by $P(m,n,\theta)$. It can be calculated as $$P(m,n,\theta)=\frac{1}{m^n}\sum_{s=\lceil n\theta \rceil}^{mn}\sum_{k=0}^{\lfloor\frac{s-n}{m}\rfloor}(-1)^k\binom{n}{k}\binom{s-mk-1}{n-1}$$ (the outer sum starts at $\lceil n\theta \rceil$, so this coincides with strictly exceeding $\theta$ whenever $n\theta$ is not an integer). The probability that we need to roll the $m$-sided die exactly $n$ times before the average first exceeds the threshold is denoted by $Q(m,n,\theta)$. For example, $Q(6,3,3.8)=9/216=1/24$, since these are the $9$ sequences of rolls of a $6$-sided die whose running average first exceeds $3.8$ on the third roll:

(1,5,6),(1,6,5),(1,6,6),(2,4,6),(2,5,5),(2,5,6),(3,3,6),(3,4,5),(3,4,6)
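
To check the formula for $P(m,n,\theta)$ and the example value of $Q(6,3,3.8)$ numerically, here is a minimal Python sketch. The function names `ways`, `P` and `stopping_distribution` are my own, not from any library. $Q$ is not obtained from a closed form here; instead the sketch tracks the distribution of the running sum over all sequences that have not stopped yet, which is one possible way to get $Q(m,n,\theta)$ exactly for small $n$. Exact fractions are used so that comparisons such as $s/n > \theta$ are not affected by floating-point rounding.

```python
from fractions import Fraction
from math import ceil, comb


def ways(m, n, s):
    """Number of ordered outcomes of n rolls of an m-sided die with sum s."""
    if s < n or s > m * n:
        return 0
    return sum((-1) ** k * comb(n, k) * comb(s - m * k - 1, n - 1)
               for k in range((s - n) // m + 1))


def P(m, n, theta):
    """Probability that the sum of n rolls is at least ceil(n * theta)."""
    theta = Fraction(str(theta))  # exact value, e.g. 3.8 -> 19/5
    lo = max(n, ceil(n * theta))
    return Fraction(sum(ways(m, n, s) for s in range(lo, m * n + 1)), m ** n)


def stopping_distribution(m, theta, n_max):
    """For n = 1..n_max, return a dict {sum: probability} of stopping at roll n.

    A sequence stops at the first roll where its running average exceeds theta.
    """
    theta = Fraction(str(theta))
    alive = {0: Fraction(1)}  # running sum -> probability of not having stopped
    result = []
    for n in range(1, n_max + 1):
        still_alive, stopped = {}, {}
        for s, p in alive.items():
            for face in range(1, m + 1):
                t = s + face
                bucket = stopped if Fraction(t, n) > theta else still_alive
                bucket[t] = bucket.get(t, 0) + p / m
        result.append(stopped)
        alive = still_alive
    return result


def Q(m, n, theta):
    """Probability that the average first exceeds theta exactly at roll n."""
    return sum(stopping_distribution(m, theta, n)[n - 1].values())


print(P(6, 2, 3.8))   # 15 of the 36 two-roll outcomes have sum >= 8
print(Q(6, 3, 3.8))   # the 9 sequences listed above, out of 216
```

The dictionary returned for each $n$ also carries the stopping sums, so the conditional average at stopping can be read off from the same pass.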

We denote by $R(m,n,\theta)$ the expected reward given that we stop after exactly $n$ rolls (the point at which the threshold $\theta$ is first exceeded). So, the overall reward $R(m,\theta)$ can be calculated as $$R(m,\theta)=\sum_{n=1}^{\infty}Q(m,n,\theta)R(m,n,\theta)$$ I am not sure how to derive closed-form expressions for $Q(m,n,\theta)$ and $R(m,n,\theta)$. If we manage to find them, we could then use a numerical search to find the value of $\theta$ that maximizes $R(m,\theta)$.
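
In the absence of a closed form, the series for $R(m,\theta)$ can at least be evaluated numerically by truncating it at some horizon. The sketch below is one way to do that, under assumptions of my own: the function name `truncated_reward` and the cut-off `n_max` are hypothetical, and the terms beyond `n_max` are simply dropped, so the first return value is a lower bound on the series above (all its terms are non-negative). The second return value is the probability of not having stopped by `n_max`, which gives a rough idea of how much the truncation leaves out.

```python
from fractions import Fraction


def truncated_reward(m, theta, n_max):
    """Partial sum of Q(m,n,theta) * R(m,n,theta) for n = 1..n_max.

    Returns (partial_reward, unstopped_mass). The threshold is converted to an
    exact fraction so that the test "average > theta" is not affected by
    floating-point rounding; probabilities themselves are kept as floats.
    """
    theta = Fraction(str(theta))
    alive = {0: 1.0}  # running sum -> probability of not having stopped yet
    reward = 0.0
    for n in range(1, n_max + 1):
        limit = n * theta  # stop when the running sum exceeds n * theta
        still_alive = {}
        for s, p in alive.items():
            for face in range(1, m + 1):
                t = s + face
                if t > limit:
                    # stopped at roll n with average t / n
                    reward += (p / m) * t / n
                else:
                    still_alive[t] = still_alive.get(t, 0.0) + p / m
        alive = still_alive
    return reward, sum(alive.values())


# Scan a few thresholds for a 6-sided die (n_max = 200 is an arbitrary choice).
for theta in (3.7, 3.8, 3.9, 4.0, 4.1):
    partial, unstopped = truncated_reward(6, theta, 200)
    print(theta, partial, unstopped)
```

A grid scan like the loop at the end, or any one-dimensional search over $\theta$, could then be used to look for the maximizing threshold, bearing in mind that the result depends on how the unstopped sequences are treated.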

I would appreciate any help from this community.