It's been years since I've done any prob/stat, so I don't have the tools right now to tackle this one. I'm hoping you can help me out with something I've been mulling over recently after a discussion with some HR folks. They were talking about performance measures, and it got me thinking about a resulting kind of random process. Apologies if this has been asked already.
Suppose $$X_{i}(t) \sim N(\nu_{i}, 1)$$ where $\nu_{i}$ is a realization of $$\nu \sim N(\mu,1)$$
Note: I've assumed both the population and individual variances are 1 for simplicity of asking the question. Suppose you select $n$ individuals from the population, $$X_{i}, \quad i=1,\ldots,n,$$ so that at any time $$X_{i}(t) \sim N(\nu_{i},1).$$
Suppose at time $t=0$ we observe each $X_{i}(0)$ and compute the mean $$\frac{X_{1}(0)+...+X_{n}(0)}{n}=\alpha_{0}$$
after which the individuals whose values fall in the bottom $\rho=10\%$ of values are removed and replaced with new individuals, so that $n$ remains constant. Continue this process for $t=0,1,2,3,\ldots$, each time calculating $\alpha_{t}$.
1) Can anything be said about $\lim_{t \to \infty} \alpha_{t}$?
I would think that for small $\rho$ we'd see something like $\lim_{t \to \infty} \alpha_{t}=\infty$ a.s., while for large $\rho$, $\alpha_{t}$ would converge in distribution to something with mean $\mu$.
2) How does the answer to 1) change if we do the following instead? We divide $[0,1]$ into $k$ subintervals and observe $X_{i}(t)$ for $t=\frac{1}{k},\frac{2}{k},\ldots,1$. We then compute the observed mean for each individual, $\overline{X_{i}}$, and calculate $$\frac{\overline{X_{1}}+...+\overline{X_{n}}}{n}=\alpha_{0},$$
after which the individuals whose means fall in the bottom $\rho=10\%$ of values are removed and replaced with new individuals, so that $n$ remains constant. Continue this process over successive unit intervals indexed by $\tilde{t}=0,1,2,3,\ldots$, each time calculating $\alpha_{\tilde{t}}$.
If you keep the largest values while resampling the smaller values, the sample mean will eventually become arbitrarily large. This is because for any $K > 0$ you will almost surely eventually see values that exceed $K$. Such values will eventually be replaced by even larger values, but never by smaller ones. The new values will, with very high probability, be small values that are unable to compensate for the retained large values.
Here is some simulation code to play around with these ideas:
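For instance, here is a Python sketch of such a simulation (the function name `simulate` and the parameter values for `n`, `rho`, and `rounds` are illustrative choices; retained values are held fixed between rounds, matching the argument above):

```python
import numpy as np

def simulate(n=100, rho=0.1, mu=0.0, rounds=1000, seed=0):
    """Cut-the-bottom-rho selection process: each round, drop the
    bottom rho fraction of values and replace them with fresh draws.
    Retained values stay fixed; only replaced individuals are redrawn.
    """
    rng = np.random.default_rng(seed)
    nu = rng.normal(mu, 1.0, size=n)   # latent means nu_i ~ N(mu, 1)
    x = rng.normal(nu, 1.0)            # observed values X_i ~ N(nu_i, 1)
    k = int(rho * n)                   # how many to replace each round
    alphas = []
    for _ in range(rounds):
        alphas.append(x.mean())                  # alpha_t
        worst = np.argsort(x)[:k]                # bottom rho fraction
        nu[worst] = rng.normal(mu, 1.0, size=k)  # fresh individuals...
        x[worst] = rng.normal(nu[worst], 1.0)    # ...and their values
    return np.array(alphas)

alphas = simulate()
print(alphas[0], alphas[-1])  # the sample mean drifts upward over time
```

With these settings the mean climbs steadily, since the surviving values behave like high order statistics of an ever-growing pool of draws.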
Even though in principle you will eventually get arbitrarily large values, you won't see them by running this code: the expected waiting time before seeing, e.g., a standard normal variable that exceeds 100 is astronomically large.
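To put a number on that, for large $x$ the standard normal tail satisfies $P(Z > x) \approx \phi(x)/x$ (the Mills-ratio approximation), so the log tail probability at $x = 100$ can be computed directly. This is a back-of-the-envelope sketch, not part of the snippet above:

```python
import math

x = 100.0
# Mills-ratio approximation for the standard normal tail:
#   P(Z > x) ~ phi(x) / x   for large x
log_tail = -x * x / 2 - math.log(x * math.sqrt(2 * math.pi))
log10_tail = log_tail / math.log(10)
print(log10_tail)  # roughly -2174, i.e. P(Z > 100) is about 10**-2174
```

So the expected number of independent draws before one exceeds 100 is on the order of $10^{2174}$ — far beyond anything a simulation will reach.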