Stoppage time for sequence of uniform random numbers with a recursively shrinking domain

61 Views Asked by At

Define $x_n = U(x_{n-1})$ where $U(x)\in\lbrace 0,1,\ldots,x\rbrace$ is a uniformly distributed random integer. Given $x_0$ as some large positive integer, what is the expected value of $n$ for which $x_n=0$?

The answer I came up with uses iterated conditioning, writing $y_n=\mathbb{E}[x_n]$, $y_n = \mathbb{E}[\mathbb{E}[U ( x_{n-1} ) | x_{n-1}]] = \mathbb{E}[\mathbb{E}[x_{n-1}/2 | x_{n-1}]] = \frac{1}{2}y_{n-1}$ with $y_0=x_0$ so $y_n=x_0/2^n$. To me, the question can then be recast as finding the value of $n$ for which $y_n < 1/2$, i.e. $$n = \left\lceil \frac{\ln x_0}{\ln 2}\right\rceil+1$$ But I tried simulating this in R for some values of $x_0$ and this answer seems to consistently overestimate the simulated result, have I made a blunder in my reasoning?

For reference, here is my R code:

simStopTime <- function(x1) {
  i <- 1
  x <- x1
  while (x[i] > 0 ) { 
    i <- i+1
    x[i] <- sample(0:x[i-1], 1)  
  }
  return(i - 1)  # Subtract 1 because indexing in R starts at 1
}

samples <- replicate(5000, simStopTime(1000))
mean(samples) # Fluctuates around 8.47 on repeated runs
log(1000)/log(2) + 1 #Gives 10.96578
1

There are 1 best solutions below

3
On BEST ANSWER

The expected waiting time $T_k$ to get down from $k$ to $0$ is $T_0 = 0$ for the base case, and otherwise it is $T_k = 1 + H_k$, where $H_k$ is the $k$th harmonic number $1 + 1/2 + 1/3 + \cdots + 1/k$. For large $k$ this is approximately $1 + \gamma + \ln k$, where $\gamma \doteq 0.57722$ is the Euler-Mascheroni constant.

By inspection $T_0 = 0$. For $k > 0$, we observer that it takes one step to get to a number that is uniformly distributed in the interval $[0, k]$, and the expected time can therefore be written recursively:

$$ T_k = 1 + \frac{1}{k+1} (T_0 + T_1 + T_2 + \cdots + T_k) $$

Recognize that $T_0 = 0$ and multiply both sides by $k+1$:

$$ (k+1)T_k = k+1 + T_1 + T_2 + \cdots + T_k $$

Subtract $T_k$ from both sides:

$$ kT_k = k+1 + T_1 + T_2 + \cdots + T_{k-1} $$

In particular

$$ T_1 = 1+1 = 2 $$

We now proceed by induction. Suppose that we know already that $T_i = 1+H_i$ for $1 \leq i \leq k-1$. We can then write

$$ \begin{align} kT_k & = k+1 + (1+H_1) + (1+H_2) + \cdots + (1+H_{k-1}) \\ & = k+1 + (k-1) + H_1 + H_2 + \cdots + H_{k-1} \\ & = 2k + (1) + (1 + 1/2) + \cdots + (1 + 1/2 + 1/3 + \cdots + 1/(k-1)) \end{align} $$

After the $2k+1$ in the last line, we have $k-1$ terms of $1$, $k-2$ terms of $1/2$, $k-3$ terms of $1/3$, and so on, until $1$ term of $1/(k-1)$. We can therefore write

$$ \begin{align} kT_k & = 2k + \frac{k-1}{1} + \frac{k-2}{2} + \frac{k-3}{3} + \cdots + \frac{1}{k-1} \\ & = 2k + \Bigl(\frac{k}{1}-1\Bigr) + \Bigl(\frac{k}{2}-1\Bigr) + \cdots + \Bigl(\frac{k}{k-1}-1\Bigr) \\ & = 2k - (k-1) + \frac{k}{1} + \frac{k}{2} + \frac{k}{3} + \cdots + \frac{k}{k-1} \\ & = k+1 + kH_{k-1} \end{align} $$

Divide both sides by $k$ to get:

$$ T_k = 1 + 1/k + H_{k-1} = 1+H_k $$

For $k = 1000$, we have $T_{1000} = 1+H_{1000} \doteq 1 + 0.5772 + 6.9078 = 8.4850$. (A symbolic math package gives us more directly the value $T_{1000} = 1+H_{1000} \doteq 8.48547$.)