Distribution of maximum proportion of heads in an infinite sequence of flips

203 Views Asked by At

Let $X_1, X_2, X_3, \dots$ be independent variables each of which is $1$ with probability $1/2$ and $0$ otherwise. Let $$S_n=\sum_{i=1}^n X_i$$ and define $$Y = \sup_{n \geq 1} \frac{S_n}{n}$$

With probability $1$, "$\sup$" here can be replaced by $\max$, since $S_n/n$ is approaching $1/2$ by the Law of Large Numbers. In particular, $Y$ is rational with probability $1$. I am curious what is known and what has been studied about the distribution of $Y$. For example, are there closed forms for or bounds on

  • The expectation and variance of $Y$?
  • The probability that $Y=q$, for specific rational $q$ between $1/2$ and $1$?
  • The probability that $Y \leq \frac{1}{2}+\epsilon$, for small $\epsilon$?

I feel like this is something that has to be well studied, but I'm not sure the specific terms to look under for it.

3

There are 3 best solutions below

2
On BEST ANSWER

The second question, and in fact a version for general Bernoulli variables (i.e., also for biased coins), is completely solved, with a somewhat explicit formula of the distribution of $Y$, in the paper Wolfgang Stadje, The Maximum Average Gain in a Sequence of Bernoulli Games, The American Mathematical Monthly, Vol. 115, No. 10 (Dec., 2008), pp. 902-910.

The main result is for a sequence of independent variables $X_i'$ with $\mathbb{P}(X_i' = 1) = p \in (0,1)$ and $\mathbb{P}(X_i' = -1) = q = 1-p$, and corresponding $S_n' = \sum_{i=1}^n X_i'$ and $Y' = \sup \frac{S_n'}{n}$, which are obvious affine transformations of $X_i$, $S_n$, and $Y$ as defined in the question.

Results:

  1. $\mathbb{P}(Y' \in (p-q,1] \cap \mathbb{Q}) = 1$;
  2. $\mathbb{P}(Y' = x) > 0$ for every $x \in (p-q,1]$;
  3. $\mathbb{P}(Y' = 1) = p$;
  4. $\mathbb{P}(Y' = 0) = 0$ if $p \ge 1/2$;
  5. $\mathbb{P}(Y' = 0) = p(1-p/q)$ if $p<1/2$;

The description of the general distribution is a little more complicated, here it is: If $r/s \in (p-q,1)$ is a rational number in reduced form, with positive denominator, let $f(z) = pz^{2s} - z^{s+r} + q$. This is a polynomial of degree $2s$, so it has $2s$ complex zeros $z_0, \ldots, z_{2s-1}$. The paper shows that $s+r$ of the roots are in the closed unit disk, among those always $z=1$, and in the case that $s+r$ is even also $z = -1$, so we can assume $z_0=1$, $|z_i| \le 1$ for $1 \le i \le s+r-1$, and $|z_i|>1$ for $s+r \le i \le 2s-1$. If $s+r$ is even, we additionally assume that $z_1 = -1$. With this setup, the results are

  1. $\mathbb{P}(Y' \le r/s) = \prod\limits_{i=r+s}^{2s-1} (1-z_i^{-1})$;
  2. $\mathbb{P}(Y' = r/s) = \prod\limits_{i=r+s}^{2s-1} (1-z_i^{-1}) + p \prod\limits_{i=r+s}^{2s-1} (1-z_i)$

From (6) with $r=1$ and $s$ large, one should be able to get something about your third question, but I have not tried to see whether this is easy or hard.

0
On

I'm not sure all the terms you might want to search under, but I would include "random walk", since $P(Y\ge \frac{1}{2}+\epsilon)$ is just the probability of falling off the proverbial bridge if it is infinitely wide to the left and getting wider to the right along a diagonal with slope controlled by $\epsilon$.

1
On

Here is a picture of the distribution of $Y$ on $(\frac 12; 1)$ (I am not showing the top half which is just a vertical segment at $Y=1$)

enter image description here

This plot was made with all the fractions of denominator up to $200$. For $r = p/q$, there are $n=q-p$ roots to find, which you can apparently get with Newton's method starting at the $\exp((-1+2ik \pi)/n)$ for $k=0 \ldots n-1$. That everything lines up properly likes this should be evidence that this heuristic works.

Using the data from all those points gives $0.77 \le E[Y] \le 0.85$ (the gap corresponds to the missing probability mass, I realize now I could be a lot more precise here).

I think we can prove, looking at $r = 1- \frac 1n$ and the behaviour of the only small root $x_n$ of $1-2x+x^n$, that there is a horizontal tangent at $(1,\frac 12)$ (in fact it should converge exponentially to $\frac 12$).

To study the behaviour at $Y \approx \frac 12$, one would need to study the roots of $1 - 2x^n + x^{2n+1}$, which seems a bit harder .