Finding Probability that Maximum Value of $N + 1$ Random Variables is Larger than a Given Limit

224 Views Asked by At

For positive random variables $X_{0}, X_{1}, X_{2}, \ldots, X_{N}$, where $N$ is also a random variable, we know that $X_{1}, X_{2}, \ldots, X_{N - 1}$ are I.I.D. with continuous PDF $f(x)$ and $X_{0} + X_{1} + X_{2} + \ldots + X_{N} = 1$. What is $P(\max_{i = 0} ^ {N} X_{i} \le d)$, for a given $d$?

If an exact solution is not possible, can one find bounds for this probability? Can this problem be solved if $f(x)$ is some well-known distribution (say exponential, log-normal, etc.)?

EDIT. An equivalent problem is this: a number of points lie in the interval $[0, 1]$. If we know that the length of segments between each two consecutive points are I.I.D random variables with PDF $f(x)$ (this does NOT include the distance between $0$ and the first point, or $1$ and the last point), what is the probability that there is no opening larger than $d$ in this interval? $d$ is a given and fixed parameter.

EDIT2. A related problem may be this: for a 1-D point process spanning $\mathbb{R}$ where interarrival times are I.I.D. random variables with PDF $f(x)$, what is the probability of observing an opening bigger that a given value $d$ in the interval $[0, 1]$? The spaces between $0$ and the first point falling in the interval and $1$ and the last point falling in the interval count as well. For example, if for some realization there is only one point in the interval $[0, 1]$ at $\frac{1}{2}$, there are two openings in this interval and both have a length of $0.5$. If for some other realization there are no points in the interval $[0, 1]$, there is only one opening in this interval and it has a length of $1$.

Hope this edit clears up the confusion and does not add to it.

1

There are 1 best solutions below

7
On

The problem is underspecified by a lot... We know almost nothing about $X_0, X_N$ and $N$ except that the three of them together contrive to make $\sum_{j=0}^N X_j = 1$.

For shorthand, write $M = \max_{j=0}^N X_j.$

Obviously $M \le 1$, so $\forall d \ge 1: P(M \le d) = 1$. This is trivial and from now on we only consider $d < 1$.

Claim A: No non-trivial (i.e. $> 0$) lower bound is possible. I.e. there exists an example where $\forall d < 1: P(M \le d) = 0$.

Example A: $N \equiv 0, X_0 \equiv X_N \equiv 1$. In this example $M \equiv 1$, so $\forall d < 1: P(M \le d) = 0$.

Claim B: If $d \ge \frac12$, no non-trivial (i.e. $<1$) upper bound is possible. I.e. there exists an example where $\forall d \ge \frac12: P(M \le d) = 1$.

Example B: $N \equiv 1, X_0 \equiv X_N \equiv \frac12$. In this case $M \equiv \frac12$, so $\forall d \ge \frac12: P(M \le d) = 1$.

Claim C: If $d < \frac12$, a possibly non-trivial upper bound is $P(M \le d) \le P(X_1 \le d) = F(d)$ where $F()$ is the CDF of $X_1$.

Proof: We are free to pick $X_0, X_N$ as high as desired subject to each being $\le d$, but this leaves a "gap" of at least $1-2d > 0$, which must be filled by at least $X_1$. For $M\le d$ to happen requires $X_1 \le d$. Thus $P(M \le d) \le P(X_1 \le d) = F(d)$.

Claim D: (Stronger version of Claim C) If $d < \frac12$, a possibly non-trivial upper bound is $P(M \le d) \le F(d)^K$ where $K = \lceil {1-2d \over d} \rceil = \lceil {\frac1d - 2} \rceil$.

Proof: Refining the above proof, if a gap of size at least $1-2d$ is to be filled by one or more $X_j$'s, and each $X_j \le d$, then the number of such $X_j$'s must be $\ge K = \lceil {1-2d \over d} \rceil$. For $M \le d$ to happen therefore requires at least all of $X_1, X_2, \dots, X_K \le d$. Since they are I.I.D., $P(\text{all } X_1, \dots, X_k \le d) = F(d)^K$.

Claim E: For any particular value of $d$ (where $d < \frac12$), the bound in Claim D is tight. I.e. there exists an example (dependent on $d$) where $P(M \le d) = F(d)^K$.

Example E: $X_0 = d, N = K+1, X_N = 1 - (K+1)d,$ and $X_j$ is bi-valued (biased coin flip) and takes value $d$ with probability $F(d)$ and some other value $c > d$ with probability $1 - F(d)$. Then $M \le d$ happens iff you get $K$ coin flips of value $d$ each, i.e. $P(M \le d) = F(d)^K$.


I imagine better bounds are possible if we have more info, e.g. to eliminate/disallow Examples A and B. Or even just in Claim D/E, a better bound might be possible if we know the behavior of $F()$ at small values, to eliminate Example E.