What is the limiting probability distribution of the output of a particular die-rolling process as the number of rolls goes to $\infty$?

163 Views Asked by At

Suppose I have a die with six arithmetic operations--- $${-}2, {-}1, \times 0, +1, +2, +3$$ ---and that each roll of the die is uniformly distributed.

To any finite sequence of rolls of the die, assign the value given by successively applying the operations to a starting value of $0$. So, for example, for the $5$-term sequence $(+1,+3,{-}2,\times 0,-1)$, the value would be: $$((((0 + 1) + 3) - 2) \times 0) - 1 = -1 .$$ For any number $n$ of rolls, we can ask for the fraction $P_n(k)$ of $n$-roll sequences that have value $k$.

How can I find an explicit expression for the function $$f(k) := \lim_{n \to \infty} P_n(k)?$$

3

There are 3 best solutions below

0
On BEST ANSWER

This answer expands on A. Kriegman's and folds in some of my comments thereunder.

Let $P_n(k)$ denote the fraction of values of $n$-term sequences with value $k$, which we can interpret as the probability that the value of a uniformly randomly selected $n$-term sequence has value $k$.

The limiting probabilities $p_k := \lim_{n \to \infty} P_n(k)$ are stable under the application of a uniformly selected die roll, giving an infinite set of equalities: $$\begin{array}{rcll} p_k &=& \frac{1}{6}(p_{k - 3} + p_{k - 2} + p_{k - 1} + p_{k + 1} + p_{k + 2}), & k \neq 0 \\ p_0 &=& \frac{1}{6}(p_{- 3} + p_{- 2} + p_{- 1} + p_{1} + p_{2} + 1) . \\ \end{array}\qquad (\ast)$$

The first equation defines a linear recurrence with characteristic polynomial $$p(r) = r^5 + r^4 - 6 r^3 + r^2 + r + 1,$$ and so the half-infinite sequences $\{p_k\}_{k \leq 0}$ and $\{p_k\}_{k \geq 0}$ can be given as linear combinations of powers $\alpha_i^k$ of the roots $\alpha_i$ of $p$ (possibly with different coefficients for $k > 0$ and $k < 0$).

The roots of $p$ are : $$ \alpha = 0.82140\ldots, \quad \beta = -0.27496\ldots+i 0.38561 \ldots, \quad \bar\beta, \quad \gamma = 1.77912\ldots, \quad \delta = -3.05060\ldots . $$ Since $0 \leq p_k \leq 1$ for all $k$, the coefficients of $\gamma, \delta$ (whose real parts have absolute value $> 1$) must be zero for the sequence $\{p_k\}_{k \geq 0}$, and the coefficients of $\alpha, \beta, \bar\beta$ (whose real parts have absolute value $< 1$) must be zero for $\{p_k\}_{k \leq 0}$, and so $$\boxed{\begin{array}{rcll} p_k &=& A \alpha^k + B (\beta^k + \bar\beta^k), &k \geq 0 \\ p_k &=& C \gamma^k + D \delta^k , &k \leq 0 \end{array}\qquad (\ast\ast)}$$ for some constants $A, B, C, D$. (NB we can rewrite $\beta^k + \bar\beta^k$ as a manifestly real expression, namely, as $2 e^{\operatorname{Re}(\beta) k} \cos (\operatorname{Im}(\beta) k)$.) We can find those constants by producing an independent linear system in those variables and solving; one option is to substitute the expressions $(\ast\ast)$, $k = -1,0,1$ in $(\ast)$. We get one equation each from substituting the first and second equations in $(\ast\ast)$ in $(\ast)$, or we can replace one of those two equations with the condition $A + 2 B = C + D$ given by substituting $k = 0$ in both of the equations in $(\ast\ast)$.

Appealing to a C.A.S. produces explicit formulae for $A, B, C, D$ as rational polynomials in $\alpha, \beta, \gamma, \delta$, but the expressions are unwieldy (hundreds of thousands of characters among them), and it is not evident that they can be simplified further. Their numerical values are: $$\boxed{\begin{align*} A &= 0.13210\ldots\\ B &= 0.04359\ldots\\ C &= 0.15602\ldots\\ D &= 0.06328\ldots . \end{align*}}$$ In particular, $p_0 = 0.21930\ldots$.

Since $A, C \neq 0$, the limiting behaviors of $p_k$ are \begin{align*} p_k \sim A \alpha^k ,&\quad k \to \phantom{-}\infty \\ p_k \sim C \gamma^k ,&\quad k \to -\infty . \end{align*}

Remark One might ask whether we can produce exact expressions for the roots $\alpha, \beta, \ldots$ of the (quintic) polynomial $p$. If we restrict ourselves to algebraic expressions, we cannot: By reducing modulo $2$ we can efficiently deduce that $p$ is irreducible over $\Bbb Q$, so its Galois group contains a $5$-cycle. On the other hand, we've seen that $p$ has exactly $2$ nonreal roots, and hence the complex conjugation map is a transposition in the Galois group of $p$. But a transposition and a $5$-cycle generate all of $S_5$, which is hence the Galois group. In particular, it is not solvable, so the roots $\alpha, \beta, \ldots$ are not expressible in terms of radicals.

5
On

You can model this as a Markov chain and there are known techniques for how to solve these problems. I'll explain how we can solve this example.

Let $p_n$ be the probability of having a total of $n$ after a large number of rolls. If we've gone on long enough then we should expect these probabilities to not change after our next roll. So, $$p_n = \frac{1}{6}(p_{n-3} + p_{n-2} + p_{n-1} + p_{n+1} + p_{n+2})$$ except when $n=0$, in which case $$p_0 = \frac{1}{6}(p_{-3} + p_{-2} + p_{-1} + p_{1} + p_{2} + 1)$$. That extra one represents the chance that we could come back to 0 from any number.

Note that if we didn't have the set 0 roll, then this technique wouldn't work because the solution would be that all the $p_n$s are equal, but this is impossible because there are infinitely many of them and they sum to 1. In that case we would have to limit ourselves to questions like what happens after $t$ throws instead of what happens asymptotically. I believe in this case we can solve this system, but I'm not exactly sure how off the top of my head.

Once we do solve this system, the solution is called the stationary distribution because it stays stationary after the next roll. There is a handy theorem that for any Markov chain that has a stationary distribution, it will approach the stationary distribution given enough time. I'm not sure the exact statement, but I believe it holds in this case. So all you have to do is solve that infinite system of equations.

7
On

The limit that you hit any number $k$, positive or negative, goes to $1$ as $n \to \infty$. Say we want the chance of hitting $k=27$. This is higher than the chance we get a $0$ and then $9\ +3$'s in a row because there are other ways to get to $27$, but the chance you get that string in $n$ throws is $1-6^{9-n}$. This goes to $1$ as $n \to \infty$. The reset to $0$ allows us to overcome the upward bias of $\frac 12$ per throw.