If $X_1, \ldots, X_n \sim N(0,1)$, with each of them independent (i.e. i.i.d.), then a well-known result from probability theory is that:
$$ E\left(\max_{1 \leq i \leq n}X_i\right) = O\left(\sqrt{\log n}\right) $$
The notation on the right is big-O notation. Now, I read in a paper that this signifies that while the normal distribution can be good for fitting data, it is hard for the normal distribution to predict extreme events. However, I am not sure why the above result necessarily shows this. Is it because $\sqrt{\log n}$ grows very slowly, so the expected maximum stays small unless $n$ is enormous, and hence we would need a huge number of samples before the model predicts (via this expectation) a value as large as an extreme event? Thanks.
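To make the slow growth concrete, here is a quick Monte Carlo sketch of my own (not from the paper): it estimates $E\left(\max_i X_i\right)$ for i.i.d. standard normals at several values of $n$ and compares it with the $\sqrt{2\log n}$ rate that is the usual sharp constant behind the big-O bound.

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_max(n, reps=20000):
    """Monte Carlo estimate of E[max of n i.i.d. N(0,1) variables]."""
    return rng.standard_normal((reps, n)).max(axis=1).mean()

# Even multiplying n by 10 each time barely moves the expected maximum,
# consistent with the sqrt(log n) growth.
for n in [10, 100, 1000, 10000]:
    print(n, round(expected_max(n), 3), round(np.sqrt(2 * np.log(n)), 3))
```

The estimated expected maximum creeps up by well under one unit each time $n$ grows tenfold, which is the sense in which the normal distribution "does not expect" large extremes.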