Rescaling to obtain Extreme Value Statistics in the Limit of Infinitely Many Random Variables


Assume I have $M$ i.i.d. random variables $X_1,\dots,X_M$ with density $P(x) = A\exp(-B \left| x \right|^{\delta})$, and I'm interested in the behavior of the smallest of them, $X_{\text{min}}$. Clearly $$P(X_{\text{min}} > x) = P(X_1 > x)^M.$$ Furthermore, defining $x^{\ast}(M)$ by $P(X_1 < x^{\ast}(M)) = 1/M$, I find that, for $M \rightarrow \infty$, $$\left|x^{\ast}(M)\right|^{\delta} = \log(M)/B + O(\log\log(M)).$$

Now I was told that, to obtain extreme value statistics for $M \rightarrow \infty$, I should focus on values close to $x^{\ast}(M)$, i.e. write $x = x^{\ast}(M) + \epsilon/(B\delta \left| x^{\ast}(M) \right|^{\delta-1})$ for fixed $\epsilon$, which gives $$P(X_i > x) = 1 - \frac{1}{M}e^{\epsilon}(1 + o(1)).$$
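For concreteness, here is a small Monte Carlo sanity check of this rescaling in the special case $\delta = 2$, $B = 1/2$, $A = 1/\sqrt{2\pi}$ (standard normals), where $B\delta\left|x^{\ast}\right|^{\delta-1} = \left|x^{\ast}\right|$. Combining the two displayed formulas, $P(X_{\text{min}} > x) = \left(1 - \tfrac{1}{M}e^{\epsilon}(1+o(1))\right)^M \to \exp(-e^{\epsilon})$, and the script below compares that limit to simulation. The values of $M$, the trial count, and the $\epsilon$ grid are illustrative choices, not part of the original statement:

```python
import math
import numpy as np

# Sanity check of the rescaling for standard normals:
# delta = 2, B = 1/2, so B*delta*|x*|^(delta-1) = |x*|.

def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def left_quantile(p, lo=-20.0, hi=0.0):
    """Solve P(X_1 < x) = p for p < 1/2 by bisection (root is negative)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if std_normal_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

M = 1000
trials = 20000
x_star = left_quantile(1.0 / M)            # P(X_1 < x*) = 1/M

rng = np.random.default_rng(0)
mins = rng.standard_normal((trials, M)).min(axis=1)   # samples of X_min

results = []
for eps in (-1.0, 0.0, 1.0):
    x = x_star + eps / abs(x_star)         # the rescaled threshold
    empirical = float(np.mean(mins > x))   # Monte Carlo P(X_min > x)
    predicted = math.exp(-math.exp(eps))   # limit of (1 - e^eps/M)^M
    results.append((eps, empirical, predicted))
    print(f"eps={eps:+.1f}  empirical={empirical:.3f}  predicted={predicted:.3f}")
```

At $M = 1000$ the agreement is only rough away from $\epsilon = 0$, because the $o(1)$ corrections decay logarithmically in $M$; this slow convergence is itself characteristic of extreme value statistics.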

I now have the following questions:

  1. Why do I focus on this value $x^{\ast}(M)$ and values close to it to obtain extreme value statistics?
  2. Why is the rescaling $x = x^{\ast}(M) + \epsilon/(B\delta \left| x^{\ast}(M) \right|^{\delta-1})$ the correct one?
  3. Why is this both intuitively and formally correct?
  4. How well does this kind of reasoning generalize?