Predict maximum of set of normally distributed values


Given a set of size n of normally distributed values, how can I predict the maximum value of the set (assuming the set contains about 50% positive and 50% negative values)? And how can I calculate the 95% confidence interval for where that maximum would fall?


There is 1 best solution below


Assuming the $n$ values are independent and identically distributed, the cumulative distribution function of the maximum is $F(x)^n$, where $F(x)$ is the cumulative distribution function of a single value. Thus its density is $nF(x)^{n-1}f(x)$, where $f(x)=F'(x)$. For large $n$, this density is concentrated where $1-F(x)$ is of order $1/n$; there $F(x)^{n-1}\approx\exp\left(-n(1-F(x))\right)$, so the density is well approximated by $n\exp\left(-n(1-F(x))\right)f(x)$. For a standard normal distribution and large $x$, we have

$$ F(x)=\frac12\left(1+\operatorname{erf}\left(\frac x{\sqrt2}\right)\right)\approx1-\frac{\mathrm e^{-\frac12x^2}}{\sqrt{2\pi}x} $$

and

$$f(x)=\frac{\mathrm e^{-\frac12x^2}}{\sqrt{2\pi}}$$

and thus

$$ nF(x)^{n-1}f(x)\approx\frac n{\sqrt{2\pi}}\exp\left(-n\frac{\mathrm e^{-\frac12x^2}}{\sqrt{2\pi}x}-\frac12x^2\right)\;. $$

Setting the derivative of the exponent to $0$ yields $\frac n{\sqrt{2\pi}}\mathrm e^{-\frac12x^2}\left(1+\frac1{x^2}\right)=x$, a transcendental equation for $x$. The leading term of the solution is determined by $n\mathrm e^{-\frac12x^2}\sim1$, and thus $x\sim\sqrt{2\ln n}$. Shifting and scaling, for a general normal distribution with mean $\mu$ and standard deviation $\sigma$, the distribution of the maximum for large $n$ is peaked around $\mu+\sqrt{2\ln n}\,\sigma$.
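The question also asks for a 95% interval, which follows from the same CDF: solving $F(x)^n=p$ gives the $p$-quantile of the maximum as $x=F^{-1}\left(p^{1/n}\right)$. The following is a minimal sketch of that calculation, assuming independent draws; the function names `max_quantile` and `asymptotic_peak` are made up for illustration, and `NormalDist` comes from Python's standard library.

```python
import math
from statistics import NormalDist


def max_quantile(p, n, mu=0.0, sigma=1.0):
    """Quantile p of the maximum of n independent N(mu, sigma^2) values.

    The CDF of the maximum is F(x)**n, so F(x)**n = p gives
    x = F^{-1}(p**(1/n)).
    """
    return NormalDist(mu, sigma).inv_cdf(p ** (1.0 / n))


def asymptotic_peak(n, mu=0.0, sigma=1.0):
    """Leading-order peak location from the answer: mu + sqrt(2 ln n) * sigma."""
    return mu + math.sqrt(2.0 * math.log(n)) * sigma


if __name__ == "__main__":
    for n in (10, 1_000, 1_000_000):
        # Central 95% interval: 2.5% and 97.5% quantiles of the maximum.
        lo, med, hi = (max_quantile(p, n) for p in (0.025, 0.5, 0.975))
        print(
            f"n={n:>9}: 95% interval [{lo:.3f}, {hi:.3f}], "
            f"median {med:.3f}, sqrt(2 ln n) = {asymptotic_peak(n):.3f}"
        )
```

For moderate $n$ the leading term $\sqrt{2\ln n}$ can sit noticeably above the exact median, since the neglected corrections shrink only slowly with $n$; the quantile formula itself involves no large-$n$ approximation.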