Variance of Order Statistics

I have a question about bounding the variance of order statistics.

Let $Bin(s,\frac{1}{n})$ denote a binomial random variable with $s$ independent trials and success probability $\frac{1}{n}$. For $i \in \{1,\cdots,\lambda\}$, define $W_i=Bin(n-s,\frac{1}{n})-Bin(s,\frac{1}{n})$, where the two binomial variables are taken to be independent and $n \in \mathbb{N}$ is a constant with respect to $s$ and $\lambda$. The $W_i$ are i.i.d. random variables.

Is there an estimate that I can use to upper bound the variance of the largest order statistic $Z= \max_{i \in [\lambda]}\{W_i\}$, i.e. a bound of the form $Var(Z)\leq U$?

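I am aware of the crude bound $Var(Z)\leq \sum_{i \in [\lambda]} Var(W_i)=\lambda \cdot n \cdot \frac{1}{n}(1-\frac{1}{n})= \lambda(1-\frac{1}{n})$, which holds for the maximum of independent random variables (for example by the Efron–Stein inequality). Note that $\max_{i \in [\lambda]}\{W_i\}\leq \sum W_i$ cannot be used directly here, since the $W_i$ can be negative.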
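
A sketch of one standard way to justify the bound $Var(Z)\leq \sum Var(W_i)$, using the Efron–Stein inequality. The symbols $W_i'$ (an independent copy of $W_i$) and $Z^{(i)}$ (the maximum with $W_i$ replaced by $W_i'$) are notation introduced only for this sketch:

$$Var(Z) \leq \frac{1}{2}\sum_{i=1}^{\lambda} E\left[(Z - Z^{(i)})^2\right] \leq \frac{1}{2}\sum_{i=1}^{\lambda} E\left[(W_i - W_i')^2\right] = \sum_{i=1}^{\lambda} Var(W_i) = \lambda\left(1-\frac{1}{n}\right).$$

The second inequality uses $|\max(a,x)-\max(a,y)| \leq |x-y|$, and the last step uses $Var(W_i) = (n-s)\frac{1}{n}(1-\frac{1}{n}) + s\frac{1}{n}(1-\frac{1}{n}) = 1-\frac{1}{n}$, assuming the two binomials in $W_i$ are independent.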

But I wonder whether a tighter upper bound can be derived in this case. For example, can the variance be bounded by a lower-order term in $\lambda$, or even by a constant $O(1)$? Thank you!

There is 1 best solution below

I cannot give an explicit bound, but I would have thought that, since the $W_i$ are independent and supported on the integers in $[-s,n-s]$, the variance of their maximum would eventually decrease toward $0$ as $\lambda \to \infty$, since the distribution of the maximum would eventually concentrate at $n-s$.
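
To make the concentration argument concrete, here is a short sketch. The symbols $F$ and $p_{\max}$ are notation introduced here, and the two binomials in $W_i$ are assumed independent. Writing $F(k)=P(W_1 \leq k)$,

$$P(Z \leq k) = F(k)^{\lambda}, \qquad P(Z = n-s) = 1-(1-p_{\max})^{\lambda}, \qquad p_{\max} = P(W_1 = n-s) = \left(\frac{1}{n}\right)^{n-s}\left(1-\frac{1}{n}\right)^{s}.$$

Since $p_{\max}>0$ and $Z$ is bounded, $P(Z=n-s)\to 1$ and hence $Var(Z)\to 0$ as $\lambda\to\infty$, but this only takes effect once $\lambda$ is large compared with $1/p_{\max}$. For $n=10$ and $s=7$, $p_{\max}=10^{-3}\cdot 0.9^{7}\approx 4.8\times 10^{-4}$, so that regime would start only for $\lambda$ in the thousands.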

I would have guessed that the variance of the maximum was bounded by the case of $\lambda=1$ which, as you say, is $1-\frac1n$, which in turn is bounded above by $1$.

Here is a simulation using R, with $n=10$, $s=7$, and $\lambda$ taking values from $1$ to $150$. Unexpectedly, even allowing for simulation noise, it suggests that the variance of the maximum is not a decreasing function of $\lambda$ everywhere.

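# One draw of Z: the maximum of lambda i.i.d. copies of
# W = Bin(n-s, 1/n) - Bin(s, 1/n), with the two binomials independent
maxW <- function(n, s, lambda){
  W <- rbinom(lambda, n-s, 1/n) - rbinom(lambda, s, 1/n)
  return(max(W))
}

n <- 10
s <- 7
maxlambda <- 150   # largest lambda to simulate
cases <- 10^4      # simulated maxima per value of lambda
simvar <- numeric(maxlambda)

set.seed(2023)
for (lambda in 1:maxlambda){
  sims <- replicate(cases, maxW(n, s, lambda))
  simvar[lambda] <- var(sims)   # sample variance of the simulated maxima
}
plot(1:maxlambda, simvar, ylim=c(0,1), xlab="lambda", ylab="simulated Var(Z)")
abline(h=0, col="black")      # lower limit for any variance
abline(h=1-1/n, col="red")    # Var(W_i) = 1 - 1/n, the lambda = 1 case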

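[Plot: simulated variance of the maximum against lambda for $n=10$, $s=7$, with a black line at $0$ and a red line at $1-\frac1n$]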
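
To separate genuine non-monotonicity from simulation noise, the variance of the maximum can also be computed exactly from $P(Z \leq k) = P(W_1 \leq k)^{\lambda}$. The following is a minimal sketch under the assumption that the two binomials in $W_i$ are independent; the helper names `pmfW` and `exactVarMax` are hypothetical and introduced only for this illustration.

```r
# Exact pmf of W = Bin(n-s, 1/n) - Bin(s, 1/n) on the integers -s..(n-s),
# assuming the two binomials are independent
pmfW <- function(n, s) {
  support <- (-s):(n - s)
  a <- dbinom(0:(n - s), n - s, 1 / n)   # pmf of the first binomial
  b <- dbinom(0:s, s, 1 / n)             # pmf of the second binomial
  joint <- outer(a, b)                   # joint pmf of the pair
  diffs <- outer(0:(n - s), 0:s, "-")    # value of W for each pair
  probs <- vapply(support, function(k) sum(joint[diffs == k]), numeric(1))
  list(support = support, probs = probs)
}

# Exact variance of the maximum of lambda i.i.d. copies of W
exactVarMax <- function(n, s, lambda) {
  w <- pmfW(n, s)
  cdfMax <- cumsum(w$probs)^lambda       # P(Z <= k) = F(k)^lambda
  pmfMax <- diff(c(0, cdfMax))           # P(Z = k)
  m1 <- sum(w$support * pmfMax)          # E[Z]
  m2 <- sum(w$support^2 * pmfMax)        # E[Z^2]
  m2 - m1^2
}

n <- 10
s <- 7
exactvar <- vapply(1:150, function(l) exactVarMax(n, s, l), numeric(1))
```

Calling `plot(1:150, exactvar, ylim=c(0,1), xlab="lambda")` draws the exact curve on the same axes as the simulation, so the two can be compared directly. For $\lambda=1$ the function returns $Var(W_1)=1-\frac1n$, which is a quick sanity check.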