It's been years since I've done any prob/stat, so I don't have the tools right now to tackle this one. I'm hoping you can help me out with something I've been mulling over recently after a discussion with some HR folks. They were talking about performance measures, and it got me thinking about a resulting kind of random process. Apologies if this has been asked already.
Suppose $$X_{i}(t) \sim N(\nu_{i}, 1)$$ where $\nu_{i}$ is a realization of $$\nu \sim N(\mu,1)$$
Note: I've assumed both the population and individual variances are 1 for simplicity of asking the question. Suppose you select $n$ individuals from the population, $$X_{i}, \quad i=1,\ldots,n,$$ so that at any time $$X_{i}(t) \sim N(\nu_{i},1).$$
Suppose at time $t=0$ we observe each $X_{i}(0)$ and compute the mean $$\frac{X_{1}(0)+...+X_{n}(0)}{n}=\alpha_{0}$$
after which the individuals whose values fall in the bottom $\rho=10\%$ of values are removed and replaced with new individuals, so that $n$ remains constant. Continue this process for $t=0,1,2,3,\ldots$, each time calculating $\alpha_{t}$.
1) Can anything be said about $\lim_{t \to \infty} \alpha_{t}$?
I would think that for small $\rho$ we'd see something like $\lim_{t \to \infty} \alpha_{t}=\infty$ a.s., while for large $\rho$, $\alpha_{t}$ would converge in distribution to something with mean $\mu$.
2) How does the answer to 1) change if we do the following instead? We divide $[0,1]$ into $k$ subintervals and observe $X_{i}(t)$ for $t=\frac{1}{k},\frac{2}{k},\ldots,1$. We then compute the observed mean for each individual, $\overline{X_{i}}$, and calculate $$\frac{\overline{X_{1}}+...+\overline{X_{n}}}{n}=\alpha_{0},$$
after which the individuals whose means fall in the bottom $\rho=10\%$ of values are removed and replaced with new individuals, so that $n$ remains constant. Continue this process over successive unit intervals indexed by $\tilde{t}=0,1,2,3,\ldots$, each time calculating $\alpha_{\tilde{t}}$.
If you keep the largest values while resampling the smaller values, the sample mean will eventually become arbitrarily large. This is because for any $K > 0$ you will almost surely eventually see values that exceed $K$. Such values will eventually be replaced by even larger values, but never by smaller ones. The new values will, with very high probability, be small values that are unable to compensate for the retained large values.
Here is some simulation code to play around with these ideas:
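For instance, here is a Python sketch of such a simulation (the function name `simulate` and the parameter values for `n`, `rho`, and `rounds` are illustrative choices; retained values are held fixed between rounds, matching the argument above):

```python
import numpy as np

def simulate(n=100, rho=0.1, mu=0.0, rounds=1000, seed=0):
    """Cut-the-bottom-rho selection process: each round, drop the
    bottom rho fraction of values and replace them with fresh draws.
    Retained values stay fixed; only replaced individuals are redrawn.
    """
    rng = np.random.default_rng(seed)
    nu = rng.normal(mu, 1.0, size=n)   # latent means nu_i ~ N(mu, 1)
    x = rng.normal(nu, 1.0)            # observed values X_i ~ N(nu_i, 1)
    k = int(rho * n)                   # how many to replace each round
    alphas = []
    for _ in range(rounds):
        alphas.append(x.mean())                  # alpha_t
        worst = np.argsort(x)[:k]                # bottom rho fraction
        nu[worst] = rng.normal(mu, 1.0, size=k)  # fresh individuals...
        x[worst] = rng.normal(nu[worst], 1.0)    # ...and their values
    return np.array(alphas)

alphas = simulate()
print(alphas[0], alphas[-1])  # the sample mean drifts upward over time
```

With these settings the mean climbs steadily, since the surviving values behave like high order statistics of an ever-growing pool of draws.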
Even though in principle you will eventually get arbitrarily large values, you won't see them by running this code: the expected waiting time before seeing, e.g., a standard normal variable that exceeds 100 is astronomically large.
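To put a number on that, for large $x$ the standard normal tail satisfies $P(Z > x) \approx \phi(x)/x$ (the Mills-ratio approximation), so the log tail probability at $x = 100$ can be computed directly. This is a back-of-the-envelope sketch, not part of the snippet above:

```python
import math

x = 100.0
# Mills-ratio approximation for the standard normal tail:
#   P(Z > x) ~ phi(x) / x   for large x
log_tail = -x * x / 2 - math.log(x * math.sqrt(2 * math.pi))
log10_tail = log_tail / math.log(10)
print(log10_tail)  # roughly -2174, i.e. P(Z > 100) is about 10**-2174
```

So the expected number of independent draws before one exceeds 100 is on the order of $10^{2174}$ — far beyond anything a simulation will reach.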