Distribution at First Time a Sum Reaches a Threshold

428 Views Asked by At

Consider the following problem. Roll a die many times, and stop when the total exceeds $M$, for some prescribed threshold $M$. Call this time $\tau$, and call the running score after $n$ rolls $X_n$.

What is the distribution of $X_\tau$?

Of course $X_\tau \in [M,M+5]$. However, I can get little beyond this. Moreover, for small $M$ the distribution should be very sensitive, and I doubt have a nice form. However, if I define $X'_\tau = X_\tau - M$, then it seems to me that $X'_\tau$ should have a limiting distribution at $M \to \infty$.


Extra comments. $\quad$ Note that there are various related questions here on maths.SE, but these (as far as I have seen) are about determining $\tau$, in particular its expectation, $E(\tau)$.

Also, one can solve for $E(\tau)$ directly. If I write $k_M$ for $E(\tau)$ with threshold $M$, then $$ \textstyle k_M = \tfrac16 \sum_{j=1}^6 k_{M-j} + 1, $$ and writing $\ell_M = k_M - k_{M-1}$ we get $$ \textstyle \ell_M = \tfrac16 \sum_{j=1}^6 \ell_{M-j}. $$ In principle, by trying the solution $\ell_r = r^\lambda$ and solving $$ 6 \lambda^6 - \lambda^5 - \lambda^4 - \lambda^3 - \lambda^2 - \lambda - 1 = 0, $$ (see this WolframAlpha computation), and calculating some initial conditions by hand, then one can find $k_M = E(\tau)$. Note that this would involve finding six initial conditions and solving a set of six simultaneous equations. This is only for the $\ell$-s; one then needs to convert this into the $k$-s. This does not sound like fun to me! =P

By a simple martingale argument---$(X_n - \tfrac72n)_{n\ge0}$ is a martingale, and $\tau$ is a deterministically bounded stopping time---from this one immediately gets $E(X_\tau) = \tfrac27 E(\tau)$ (for any $M$).

2

There are 2 best solutions below

6
On BEST ANSWER

Let $p^M_i$ be the probability that $X_{\tau(M)}=M+i$ for $i=0,1,\dots,5$, and $p^M$ denote the column vector of these six values.

You can compute $p_M$ from $p_{M-1}$ as follows. There are two ways to achieve a final value of $M+i$; either the first time that you value at or above $M-1$ you reach is $(M-1)+(i+1)$, or the first value at or above $M-1$ you reach is $M-1$, and then from there you jump immediately to $M+i$. Therefore, $$ p_i^M=\begin{cases}p^{M-1}_{i+1}+\frac16 p^{M-1}_0 & i<5\\\frac16p_0^{M-1} & i=5\end{cases} $$ This can be written as a matrix equation: $$ p^M=\begin{bmatrix} \frac16 & 1 & \\ \frac16 &0 & 1 &\\ \frac16 & 0&0 & 1 &\\ \frac16 & 0&0&0& 1 &\\ \frac16 & 0&0&0&0& 1 \\ \frac16 & 0&0 &0&0&0\\ \end{bmatrix}p^{M-1} $$ with zeroes above the super-diagonal. Letting $A$ be the above matrix, then this proves $$p^M=A^Mp^0,$$ where $p^0$ is a vector whose first entry is $1$ and whose other entries are zero.

The limiting distribution $p$ will satisfy $p=Ap$. This means that $p_i=p_{i+1}+\frac16p_0$, so that $p$ is an arithmetic progression with difference $-\frac16p_0$. A little thought shows that this implies $$ p=\left(\frac{6}{21},\frac{5}{21},\frac{4}{21},\frac{3}{21},\frac{2}{21},\frac{1}{21}\right)^T. $$

5
On

Inspired by the answer from @MikeEarnest, I wonder if the following alternate proof is valid, for the limiting case? This proof has the advantage(?) of less algebra, and hopefully more intuition into why the distribution is 6:5:4:3:2:1.

Imagine you keep rolling forever. A number is reached if it is the sum at some point in time, otherwise it is skipped. Clearly in the limit, all numbers have the same probability $q$ of being reached. (This follows from ergodicity, right? In fact, ergodicity would suggest $q = {1 \over 3.5}$, but we don't need its exactly value for now.)

Now consider the interval of interest, $X_\tau \in [M, M+5]$:

  • $X_\tau = M+5$ iff $M-1$ is reached and the next roll is $6$. This happens with probability $q/6$.

  • $X_\tau = M+4$ iff (a) $M-1$ is reached and the next roll is $5$, or, (b) $M-2$ is reached and the next roll is $6$. So this happens with probability $2q/6$.

    • Note that the case of reaching $M-2$, then rolling $1$ to reach $M-1$, then rolling $5$ to reach $M+4$, is included in (a) but not in (b), so we did not double-count.
  • Similarly, $X_\tau = M+3, M+2, M+1, M$ with probabilities $3q/6, 4q/6, 5q/6, 6q/6$ respectively.

Since these 6 possibilities are exhaustive, we have $$(1+2+3+4+5+6)q/6 = 1 \implies q = {6 \over 21} = {1 \over 3.5}$$ as I originally suspected; in particular, this implies that $$P(X_\tau = M + j) = (6-j)q/6 = (6-j)/21,$$ agreeing with Mike's answer.