Number of samples needed before sample maximum greater than some value

Question

497 Views Asked by Bumbble Comm At 02 Apr 2026 - 11:20

Let's say you have a standard normal distribution and you sample from it $N$ times. How many samples will it take before the maximum observed value is at least 3 (or, in general, some value $K$)?

To solve this problem I considered the CDF of the normal distribution for $x \geq 3$. This gives that the probability of finding a value $x \geq 3$ in one sample is $0.0013499$. Since all the samples are independent of each other, the answer would appear to be the mean of the resulting geometric distribution with $p=0.0013499$, which is $1/p \approx 740.80$.
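The arithmetic in this first attempt can be reproduced with a few lines of Python. This is only a minimal sketch using the standard library; the function name `upper_tail` is chosen here for illustration and does not come from the original post.

```python
import math

def upper_tail(k: float) -> float:
    """P(X >= k) for a standard normal X, via the complementary error function."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

p = upper_tail(3.0)      # chance that a single draw reaches 3
mean_wait = 1.0 / p      # mean of a geometric distribution with success probability p

print(f"p = {p:.7f}, mean wait = {mean_wait:.2f}")
```

Changing the argument of `upper_tail` gives the same calculation for a general threshold $K$.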

However, by simulating a large number of trials I found that the true answer is around 444 trials. Here is the Mathematica code, which averages the maximum of 444 draws over 1000 repetitions:

Table[Table[RandomVariate[NormalDistribution[]], {x, 1, 444}] // Max, {k, 1, 1000}] // Mean

This can also be verified mathematically by solving the reverse problem: the expected sample maximum from $N$ trials. Note that $[\Pr(x \leq K)]^{444}$, the probability that the results from all 444 trials are at most $K$, is the CDF of the maximum of the 444 trials. The corresponding PDF (albeit in terms of the Erf function) can be found by differentiating, and computing its expected value (or letting Mathematica approximate the integral numerically) indeed gives that 444 trials are sufficient for an expected sample maximum of 3.
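The reverse calculation can be sketched numerically as follows. This is not the Mathematica computation referred to above but a rough stand-in: it assumes a plain trapezoidal rule on $[-10, 10]$, and the function names, integration limits and step count are all choices made here for illustration.

```python
import math

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def expected_max(n, lo=-10.0, hi=10.0, steps=20000):
    """E[max of n i.i.d. standard normals] by the trapezoidal rule.

    The maximum has CDF Phi(x)**n, hence density n * phi(x) * Phi(x)**(n - 1).
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0   # half weight at the endpoints
        total += weight * x * n * normal_pdf(x) * normal_cdf(x) ** (n - 1)
    return total * h

for n in (100, 444, 741):
    print(n, round(expected_max(n), 4))
```

Printing the value for a few sample sizes makes it easy to see where the expected maximum crosses 3.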

So why did my attempt to solve the problem overshoot the answer?

Bumbble Comm On 02 Aug 2017 - 5:24

For the given $p=Q(3)$, solving $E[T]=1\times p+(1+E[T])\times(1-p)$ gives $E[T]=1/p\approx740.7967$, as you have indicated. Your simulation must simply be incorrect. Here is my very layman piece of MATLAB code:

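N = 100000;            % number of independent repetitions
res = zeros(1, N);     % res(ii) holds the waiting time of repetition ii
parfor ii = 1:N
    k = 1;             % index of the current draw
    while randn(1,1) <= 3
        k = k + 1;     % draw again until a value exceeds 3
    end
    res(ii) = k;       % index of the first draw above 3
end
mean(res)              % estimate of E[T]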

My sample run gave 742.5393.

**Bumbble Comm** · Accepted Answer

If I read your post correctly (but beware that I checked none of the numerical values involved), you are successively solving two different problems.

In both cases, one is given a sequence $(X_n)_{n\geqslant1}$ i.i.d. standard normal and one considers its running maximum defined for every $n\geqslant1$ as $M_n=\max\{X_k\mid1\leqslant k\leqslant n\}$.

Approach "741": Let $\theta_3=E(T_3)$ where $T_3=\inf\{n\geqslant1\mid X_n\geqslant3\}$, then $\theta_3=P(X_1\geqslant3)^{-1}$ and you say that $\theta_3\approx741$.

Approach "444": Let $\mu_3=\inf\{n\geqslant1\mid E(M_n)\geqslant3\}$, then you say that $\mu_3\approx444$.

Since $T_3$ is also $T_3=\inf\{n\geqslant1\mid M_n\geqslant3\}$, one is considering either $$E(\inf\{n\geqslant1\mid M_n\geqslant3\})$$ or $$\inf\{n\geqslant1\mid E(M_n)\geqslant3\}$$ which need not coincide.
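The difference between the two quantities can be illustrated with a small Monte Carlo sketch. It is only an illustration under assumptions made here: the helper names `hitting_time` and `running_max`, the seed and the number of trials are not from the question or the answers, and the estimates carry sampling noise.

```python
import random

def hitting_time(k, rng):
    """T_k: index of the first draw that is at least k."""
    n = 1
    while rng.gauss(0.0, 1.0) < k:
        n += 1
    return n

def running_max(n, rng):
    """M_n: largest of n i.i.d. standard normal draws."""
    return max(rng.gauss(0.0, 1.0) for _ in range(n))

rng = random.Random(12345)   # fixed seed, an arbitrary choice for repeatability
trials = 2000

# First quantity: average the random index at which 3 is first reached.
mean_T3 = sum(hitting_time(3.0, rng) for _ in range(trials)) / trials
# Second quantity: average the maximum after a fixed number of draws (444).
mean_M444 = sum(running_max(444, rng) for _ in range(trials)) / trials

print(f"estimate of E(T_3)   = {mean_T3:.1f}")
print(f"estimate of E(M_444) = {mean_M444:.3f}")
```

The first estimate averages a random stopping index, while the second averages the maximum at a fixed index, which is why the two approaches answer different questions.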
