The probability of observing a value as extreme, or more extreme, than the sample maximum.


Consider a random sample of independent continuous random variables $X_1, X_2, \ldots, X_n$ with common CDF $F$. Define the sample maximum:

$$ M = \max(X_1, X_2, ..., X_n) $$

Some simulations I've done (see the R code below) seem to indicate that the probability of observing a value as extreme, or more extreme, than the sample maximum is approximately $1/n$ (is this true?).

Specifically, the simulations indicate that

$$ E \big(1 - F(M) \big) \approx \frac{1}{n} $$

The fact that, for example, the maximum from a sample of size 1000 should basically be a 1-in-1000 shot agrees with my intuition, but I'm having trouble deriving this mathematically.

Any advice on this is appreciated.

set.seed(15435)

# Exponential(1): estimate E(1 - F(M)) for n = 1000 over 100,000 replications
z = rep(0,100000)
for(i in 1:100000) z[i] = 1-pexp(max(rexp(1000)))
1/mean(z)
[1] 1004.507

# Standard normal
z = rep(0,100000)
for(i in 1:100000) z[i] = 1-pnorm(max(rnorm(1000)))
1/mean(z)
[1] 996.892

# Uniform(0, 1)
z = rep(0,100000)
for(i in 1:100000) z[i] = 1-punif(max(runif(1000)))
1/mean(z)
[1] 1000.024

# Beta(3, 5)
z = rep(0,100000)
for(i in 1:100000) z[i] = 1-pbeta(max(rbeta(1000,3,5)),3,5)
1/mean(z)
[1] 999.0542
Best answer

I take it you also assume that the random variables are independent. Since they have a continuous distribution, the maximum is unique with probability 1, and the problem is symmetric in the variables: for each $1 \leq k \leq n$,
$$ P\big(X_k = \max(X_1, \ldots, X_n)\big) = c $$
for some constant $c$. Since these probabilities sum to one, you get immediately $c = 1/n$.

This further means that if you have the $n$ values $X_1, \ldots, X_n$ and draw one more value $X_{n+1}$, the probability that it is larger than all the previous values is $1/(n+1)$. Conditioning on $M$, that probability is exactly the quantity you simulated:
$$ P(X_{n+1} > M) = E\big(1 - F(M)\big) = \frac{1}{n+1} \approx \frac{1}{n}, $$
which matches your estimates (note they cluster around $1001$, not $1000$, only up to simulation error).
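To make this concrete, here is a direct computation using the probability integral transform (a standard step not spelled out in the symmetry argument above). Since $F$ is continuous, $U_i = F(X_i) \sim \mathrm{Uniform}(0,1)$, and $F(M) = \max(U_1, \ldots, U_n)$ has CDF $P\big(F(M) \leq u\big) = u^n$ on $[0,1]$, hence density $n u^{n-1}$. Therefore

$$ E\big(1 - F(M)\big) = \int_0^1 (1 - u)\, n u^{n-1}\, du = 1 - \frac{n}{n+1} = \frac{1}{n+1}. $$

For $n = 1000$ this gives $1/1001$, so $1/\mathrm{mean}(z)$ in the simulations should hover near $1001$, consistent with the values observed.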

The field of probability theory concerned with such questions is called Extreme Value Theory. You can look it up on Wikipedia or, if you want to go deeper, start with a classic such as Gumbel's *Statistics of Extremes*.