How likely are extreme observations in a probability distribution?


Given a measurement that follows a probability distribution (for the sake of argument, Gaussian), how likely is it that the latest of a series of repeated observations is an extreme, i.e. the lowest or the highest seen so far?

I realise that the first two observations will be extremes by definition, but how quickly does the probability fall by the 10th, 20th, 50th, etc. observation?

How quickly does the cumulative probability of seeing at least one extreme go up when taking observations on several different distributions?

Background
I am having a hard time forming my question (edits appreciated), so I will give some background.

I was having a conversation with a friend about the weather when he remarked that it had been the hottest September in 20 years(1). I said that he shouldn't be surprised: given the number of different weather measurements (hottest, coldest, wettest, driest, most sunny, least sunny, etc.) and the number of times the observations are made (weekly, monthly, annually), I thought it quite normal, and indeed likely, to get some sort of extreme observation.

I realise there is no exact answer to this question; I am not asking what the probability is of it being the hottest September in 20 years.

(1) After some research it is apparent that weather recordings don't follow Gaussian distributions, but it's still an interesting topic of thought.

There are 3 best solutions below

The probability that this September is the hottest September in the last $n$ years is $\frac1{n}$ if we assume that there is no time-related trend and exact equality is extremely unlikely.
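To connect this with the "many different measurements" point in the question, here is a rough sketch of how the chance of at least one record grows with the number of measurements. It assumes the $k$ measurements are independent of one another, which real weather variables are not (a hot month is often also a dry one), so treat the numbers as illustrative only. The function name `prob_at_least_one` is made up for this example.

```python
def prob_at_least_one(p_single, k):
    """Chance that at least one of k independent events occurs,
    when each occurs with probability p_single.

    Assumption (not from the answer above): the k events are independent.
    """
    return 1.0 - (1.0 - p_single) ** k


n = 20  # years of records, so each "record high" has probability 1/n
for k in (1, 3, 6, 12):
    p = prob_at_least_one(1.0 / n, k)
    print(f"k={k:2d} independent measures: P(at least one record) = {p:.3f}")
```

With $n = 20$, a single measure has a 5% chance of a record, and the chance of at least one record climbs steadily as more independent measures are tracked.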

If $X_1, \ldots, X_n$ are exchangeable random variables and $\mathbb P(X_i = X_j) = 0$ for all $i,j$, then $$\mathbb P(X_n = \max(X_1,\ldots,X_n)) = \dfrac{1}{n}$$ and (if $n \ge 2$) $$\mathbb P(X_n = \max(X_1,\ldots,X_n) \ \text{or}\ X_n = \min(X_1,\ldots,X_n)) = \dfrac{2}{n}$$
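A quick Monte Carlo check of these two formulas, as a sketch only. It assumes i.i.d. standard Gaussian draws, which is one special case of exchangeability; the function name, trial count and seed are arbitrary choices for the example.

```python
import random


def estimate_extreme_probs(n, trials=20_000, seed=0):
    """Estimate P(X_n is the max) and P(X_n is the max or the min)
    for n i.i.d. standard Gaussian draws (an assumed special case)."""
    rng = random.Random(seed)
    last_is_max = 0
    last_is_extreme = 0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        is_max = xs[-1] == max(xs)
        is_min = xs[-1] == min(xs)
        last_is_max += is_max
        last_is_extreme += is_max or is_min
    return last_is_max / trials, last_is_extreme / trials


for n in (2, 10, 20, 50):
    p_max, p_ext = estimate_extreme_probs(n)
    print(f"n={n:2d}: max {p_max:.4f} (1/n = {1/n:.4f}), "
          f"max-or-min {p_ext:.4f} (2/n = {2/n:.4f})")
```

The estimates should land close to $\frac1n$ and $\frac2n$, up to sampling noise, and the same would be expected for any continuous distribution in place of the Gaussian, since only the ranks matter.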

If you haven't made any observations yet and you plan to make $n$ observations, then the probability that any given observation is the high (or the low) is $\frac1n$, assuming that there can't be ties. However, if you've made some observations already, then the probability of a high (or low) on the next observation depends on what you've already seen.
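To illustrate the second case, here is a minimal sketch that assumes the observations are i.i.d. Gaussian with a known mean and standard deviation (a strong assumption; in practice these would have to be estimated). Under that assumption, the chance that the next observation beats the current high is just the upper tail beyond that high. The function name `prob_next_is_extreme` is invented for the example.

```python
from statistics import NormalDist


def prob_next_is_extreme(observed, mu=0.0, sigma=1.0):
    """Given past observations, return (P(next is a new high), P(next is a new low)).

    Assumes i.i.d. draws from a Gaussian with KNOWN mu and sigma.
    """
    dist = NormalDist(mu, sigma)
    p_high = 1.0 - dist.cdf(max(observed))  # mass above the current maximum
    p_low = dist.cdf(min(observed))         # mass below the current minimum
    return p_high, p_low


# Same number of past observations, very different chances of a new high:
print(prob_next_is_extreme([-0.3, 0.1, 0.4]))   # modest high so far
print(prob_next_is_extreme([-0.3, 0.1, 2.5]))   # unusually large high so far
```

Both lists contain three observations, but a new high is far less likely after the second, because the existing high of 2.5 is already deep in the tail.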