Here is the game:
There is a hidden counter, which you know has value $0$ at some starting time. The value on the counter increases by $1$ after waiting an exponentially distributed amount of time (with, say, a parameter $\lambda$ that you also know), and this repeats indefinitely. So, with no other information, the number of increments over any interval of length $t$ is Poisson distributed with mean $\lambda t$. An example of something approximately like this might be a Geiger counter. The aim of the game is to observe the counter when it shows exactly $N$, using as few observations as possible.
You may observe the counter at any point (get a single value from it at a given time), and you may reset the counter to $0$ at any point (presumably it is only optimal to do this when you "miss" your target of $N$). You finish the game when you observe the value $N$, and your "score" is the number of observations you made from the start of the game (resetting incurs no penalty, other than observations made up to that point being "wasted" as they still count to the score).
I would like to find, ideally, the optimal strategy; the one that minimises the expected number of observations. But I would be satisfied with a strategy that is optimal among some collection of reasonably obvious strategies, since I find it difficult to approach the question of minimising over all strategies. Also, I think I need a good way to calculate the expected number of observations for a given strategy (this would be a functional, on the set of functions summarising strategies, as described below).
One thing that may be useful is that the game is in some sense memoryless; the counter is not affected by anything that has happened before, and so it is quite easy to see that your decision on how long to wait before observing the counter should only be affected by your most recent observation (this would be different if you did not know $\lambda$). Therefore any strategy is simply a combination of knowing when to reset, which is when you go over $N$, and a single function $f:\Bbb N_0\rightarrow\Bbb R_+$ which tells you how long to wait until the next observation given the value of the most recent observation. Note that you in some sense "observe" the counter immediately, since you are given that it shows $0$ at the start, so the first wait time (and the wait time after any reset) is the value $f(0)$.
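For concreteness, the expected score of any such strategy can at least be estimated by simulation. Here is a minimal Monte Carlo sketch (taking $\lambda=1$; the helper names `poisson` and `play` are my own):

```python
import math
import random

def poisson(t, rng=random):
    """Sample a Poisson(t) count via Knuth's multiplication algorithm
    (fine for the small means that occur here)."""
    limit = math.exp(-t)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p < limit:
            return k
        k += 1

def play(N, f, rng=random):
    """Play one game: f maps the last observed count (< N) to a waiting time.
    Returns the number of observations used until N is first observed."""
    observations = 0
    c = 0  # last observed value; the counter starts at a known 0
    while True:
        c += poisson(f[c], rng)  # counts accumulated during the wait f[c]
        observations += 1
        if c == N:
            return observations
        if c > N:
            c = 0  # overshot the target: reset and start over

random.seed(1)
trials = 200_000
est = sum(play(1, {0: 1.0}) for _ in range(trials)) / trials
print(est)  # for N = 1 with constant wait f(0) = 1, should be close to e ≈ 2.718
```

Each observation here is a Geometric trial for $N=1$, so the estimate can be checked against the exact value $\mathrm e$.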
Some strategies:
The greedy way to play is, at every observation, to wait for the modal time for the counter to reach $N$; i.e. to wait the time that is most likely to immediately end the game at the next observation. However, I do not think this is optimal: a lot of the time (on the order of $50\%$ of the time) you will go over $N$ and have to reset, which is not good; the closer you can get to $N$ without going over, the easier it is to finish from that point on (there are fewer integers between the current count and $N$).
More conservative strategies would take that modal time for the counter to reach $N$ and wait for "a bit" less time; likely shorter by something on the order of the square root of that time, since that is the rough size of the deviations from the mean. That way you drastically increase your chance of not needing to reset, but you also drastically decrease your chance of seeing $N$. So overall you probably won't have to reset the counter, but you'll waste a lot of observations by checking the counter when you know it likely hasn't reached $N$ yet.
Great problem. I don’t see how to do this for general $N$, but I’ll do it for $N=1$ and $N=2$ to show the general principle and then give some numerical results. I’ll set $\lambda=1$ to simplify things; the resulting times just have to be multiplied by $\lambda^{-1}$.
For $N=1$, the probabilities after observing a counter value $C=0$ and waiting for time $t$ are $P(C=0)=\mathrm e^{-t}$, $P(C=1)=t\mathrm e^{-t}$, and $P(C\gt1)=1-\mathrm e^{-t}-t\mathrm e^{-t}$. Thus the expected number $X$ of observations required to observe $C=N=1$ if we wait for $t$ every time is
$$ X=1+\left(1-t\mathrm e^{-t}\right)X\;, $$
with solution
$$ X=\frac{\mathrm e^t}t\;. $$
As expected, this goes to $\infty$ both for $t\to0$ and for $t\to\infty$. Setting the derivative to $0$ yields
$$ \frac{\mathrm e^t}t-\frac{\mathrm e^t}{t^2}=0\;, $$
so the optimal waiting time is $1$ and the expected number of observations required is $\mathrm e$. (In this simple case we could also have just maximized $P(C=1)=t\mathrm e^{-t}$, but that wouldn’t work for higher $N$.)
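As a sanity check, one can also minimise $X=\mathrm e^t/t$ numerically, e.g. by ternary search on the unimodal function (a sketch):

```python
import math

def X(t):
    """Expected number of observations for N = 1 with constant wait t."""
    return math.exp(t) / t

# ternary search for the minimizer of the unimodal X on (0, 5)
lo, hi = 0.01, 5.0
for _ in range(200):
    m1 = lo + (hi - lo) / 3
    m2 = hi - (hi - lo) / 3
    if X(m1) < X(m2):
        hi = m2
    else:
        lo = m1
t_opt = (lo + hi) / 2
print(t_opt, X(t_opt))  # ≈ 1.0 and e ≈ 2.718282
```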
For $N=2$, the probabilities after observing $C=0$ and waiting for time $t_0$ are $P(C=0)=\mathrm e^{-t_0}$, $P(C=1)=t_0\mathrm e^{-t_0}$, $P(C=2)=\frac{t_0^2}2\mathrm e^{-t_0}$ and $P(C\gt2)=1-\left(1+t_0+\frac{t_0^2}2\right)\mathrm e^{-t_0}$, whereas after observing $C=1$ and waiting for time $t_1$ they are $P(C=1)=\mathrm e^{-t_1}$, $P(C=2)=t_1\mathrm e^{-t_1}$ and $P(C\gt2)=1-\left(1+t_1\right)\mathrm e^{-t_1}$. Thus, denoting by $X_0$ and $X_1$ the expected number of observations required after observing $C=0$ and $C=1$, respectively, we have
\begin{eqnarray} X_0 &=& 1+\left(1-\left(t_0+\frac{t_0^2}2\right)\mathrm e^{-t_0}\right)X_0+t_0\mathrm e^{-t_0}X_1\;, \\ X_1 &=& 1+\left(1-(1+t_1)\mathrm e^{-t_1}\right)X_0+\mathrm e^{-t_1}X_1\;. \end{eqnarray}
Solving the second equation for $X_1$ and substituting into the first equation yields
$$ X_0=1+\left(1-\left(t_0+\frac{t_0^2}2\right)\mathrm e^{-t_0}\right)X_0+t_0\mathrm e^{-t_0}\frac{1+\left(1-(1+t_1)\mathrm e^{-t_1}\right)X_0}{1-\mathrm e^{-t_1}}\;, $$
with solution
$$ X_0=\frac{2\left(t_0\mathrm e^{t_1}+\mathrm e^{t_0+t_1}-\mathrm e^{t_0}\right)}{t_0\left(t_0\mathrm e^{t_1}+2t_1-t_0\right)}\;. $$
As expected, this goes to $\infty$ for $t_0\to0$, for $t_0\to\infty$ and for $t_1\to0$ but not for $t_1\to\infty$ (since in that case we always reset after $C=1$ and try again to reach $C=N=2$ directly from $C=0$).
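For what it's worth, this closed form can be minimised numerically, e.g. by alternating one-dimensional ternary searches (a sketch; this assumes $X_0$ is unimodal in each variable separately, which appears to be the case):

```python
import math

def X0(t0, t1):
    """Closed-form expected number of observations for N = 2 (lambda = 1)."""
    num = 2 * (t0 * math.exp(t1) + math.exp(t0 + t1) - math.exp(t0))
    den = t0 * (t0 * math.exp(t1) + 2 * t1 - t0)
    return num / den

def argmin_1d(g, lo=0.01, hi=8.0, iters=200):
    """Ternary search for the minimizer of a unimodal g on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if g(m1) < g(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

t0, t1 = 2.0, 1.0  # start from the modal times
for _ in range(50):  # coordinate descent: optimize t0 and t1 in turn
    t0 = argmin_1d(lambda t: X0(t, t1))
    t1 = argmin_1d(lambda t: X0(t0, t))
print(t0, t1, X0(t0, t1))  # ≈ 1.896620, 0.890660, 3.321767
```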
I doubt that the minimum with respect to $t_0$ and $t_1$ can be determined analytically. I solved the cases up to $N=5$ numerically (here’s the code); here are the results:
\begin{array}{c|cc} N&X_0&\hat X_0&t_0&t_1&t_2&t_3&t_4\\\hline 1&2.718282&2.718282&1\\ 2&3.321767&3.335387&1.896620&0.890660\\ 3&3.697442&3.737519&2.787170&1.795431&0.837793\\ 4&3.967924&4.038364&3.678863&2.695824&1.732412&0.804879\\ 5&4.177862&4.279544&4.573151&3.595684&2.631654&1.688119&0.781754 \end{array}
$t_i$ is the optimal waiting time after observing $C=i$, $X_0$ is the optimal expected number of observations, and $\hat X_0$ is the expected number of observations for the modal estimate $t_i=N-i$. Some remarks:
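The table can be cross-checked by solving, for given waiting times $t_0,\dotsc,t_{N-1}$, the linear system generalising the $N=2$ one above. Here is a pure-Python sketch (the function name `expected_obs` is mine; the optimisation over the $t_i$ is omitted):

```python
import math

def expected_obs(ts):
    """X_0 for waiting times ts = [t_0, ..., t_{N-1}] and lambda = 1:
    solve the linear system X_i = 1 + sum_j P(i -> j) X_j."""
    N = len(ts)

    def pois(t, k):  # P(Poisson(t) = k)
        return math.exp(-t) * t**k / math.factorial(k)

    # Build (I - A) X = b, where A[i][j] is the probability of
    # continuing from state j after an observation in state i.
    M = [[float(i == j) for j in range(N)] for i in range(N)]
    b = [1.0] * N
    for i, t in enumerate(ts):
        for j in range(i, N):        # land on j < N: continue from X_j
            M[i][j] -= pois(t, j - i)
        over = 1.0 - sum(pois(t, k) for k in range(N - i + 1))
        M[i][0] -= over              # overshoot past N: reset, back to X_0
    # Gaussian elimination with partial pivoting
    for col in range(N):
        piv = max(range(col, N), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, N):
            fac = M[r][col] / M[col][col]
            for c in range(col, N):
                M[r][c] -= fac * M[col][c]
            b[r] -= fac * b[col]
    # back-substitution
    X = [0.0] * N
    for r in range(N - 1, -1, -1):
        X[r] = (b[r] - sum(M[r][c] * X[c] for c in range(r + 1, N))) / M[r][r]
    return X[0]

print(expected_obs([1.0]))                 # N = 1: e ≈ 2.718282
print(expected_obs([2.0, 1.0]))            # N = 2, modal times: ≈ 3.335387
print(expected_obs([1.896620, 0.890660]))  # N = 2, optimal times: ≈ 3.321767
```

Feeding in the tabulated $t_i$ reproduces the $X_0$ column, and the modal times $t_i=N-i$ reproduce the $\hat X_0$ column.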