Empirical CDF Properties

Suppose $X_i$ are iid $N(0,1)$ random variables and define $\hat F(x) = n^{-1} \sum_{i=1}^n \mathbf{1}(X_i \leq x)$.

I want to compute the limiting distribution of $$\sqrt{n} \left(\hat F\left(\frac{1}{\sqrt{n}}\right) - \frac{1}{2}\right).$$

I know that the limiting distribution of $\sqrt{n} \left(\hat F(0) - \frac{1}{2}\right)$ is $N(0,1/2)$ (second parameter being the standard deviation) by the central limit theorem, and that $\hat F(1/\sqrt{n}) - \hat F(0) \xrightarrow{P} 0$ (since both of the $\hat F$ terms tend to $1/2$ in probability). But I can't finish the argument from this: it would suffice to show $\sqrt{n}(\hat F(1/\sqrt{n}) - \hat F(0)) \xrightarrow{P} 0$, and I don't see how to do that. Any ideas?

There are 1 best solutions below

For large $n$, I think the following hold as slightly abusive approximations, though the last is meaningful as a statement of convergence in distribution:

  • $P\left(X_i \le \frac{1}{\sqrt{n}} \right) \approx \frac12 + \frac1{\sqrt{2 \pi n}}$
  • $\sum_{i=1}^n \mathbf{1}\left(X_i \leq \frac{1}{\sqrt{n}}\right) \sim \text{Bin}\left(n, \frac12 + \frac1{\sqrt{2 \pi n}}\right)$
  • $\sum_{i=1}^n \mathbf{1}\left(X_i \leq \frac{1}{\sqrt{n}}\right) \sim N\left(\frac n2 + \frac n{\sqrt{2 \pi n}}, \frac n4\right)$ where the second parameter is the variance (you seem to prefer the standard deviation)
  • $\frac1n \sum_{i=1}^n \mathbf{1}\left(X_i \leq \frac{1}{\sqrt{n}}\right) \sim N\left(\frac 12 + \frac 1{\sqrt{2 \pi n}}, \frac 1{4n}\right)$
  • $\sqrt{n}\left(\frac1n \sum_{i=1}^n \mathbf{1}\left(X_i \leq \frac{1}{\sqrt{n}}\right) -\frac12\right)\sim N\left(\frac 1{\sqrt{2 \pi}}, \frac 1{4}\right)$
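
The first and last approximations in this list can be tied together with a short expansion. What follows is only a sketch: $\Phi$ and $\varphi$ (the standard normal CDF and density) and $p_n = \Phi(1/\sqrt{n})$ are notation introduced here for illustration, not used elsewhere in this answer. Since $\varphi'(0) = 0$, the second-order Taylor term vanishes:

$$\Phi\left(\frac1{\sqrt n}\right) = \Phi(0) + \frac{\varphi(0)}{\sqrt n} + O\left(n^{-3/2}\right) = \frac12 + \frac1{\sqrt{2\pi n}} + O\left(n^{-3/2}\right).$$

Splitting the statistic into a centred part and a deterministic shift gives

$$\sqrt{n}\left(\hat F\left(\frac1{\sqrt n}\right) - \frac12\right) = \sqrt{n}\left(\hat F\left(\frac1{\sqrt n}\right) - p_n\right) + \sqrt{n}\left(p_n - \frac12\right).$$

The second term tends to $\frac1{\sqrt{2\pi}}$ by the expansion. The first term is a normalised sum of centred, bounded Bernoulli($p_n$) indicators with variance $p_n(1-p_n) \to \frac14$, so a central limit theorem for triangular arrays (for example Lindeberg-Feller) should give $N\left(0, \frac14\right)$ in the limit, with the second parameter the variance. Together these give the $N\left(\frac1{\sqrt{2\pi}}, \frac14\right)$ limit stated in the last bullet.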

As an illustration in R with $n=100$: we are in fact looking at a binomial random variable with adjusted scale and location, so it is discrete, with gaps of $\frac1{\sqrt{n}}$ between its possible values. The simulation gives a mean and variance close to the predicted values and a shape close to the corresponding normal distribution:

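set.seed(2020)
n <- 100        # sample size
cases <- 10^5   # number of simulated samples
Y <- numeric(cases)
for (i in 1:cases){
  # sqrt(n) * (Fhat(1/sqrt(n)) - 1/2) for one sample of size n
  Y[i] <- sqrt(n)*((1/n)*sum(rnorm(n) <= 1/sqrt(n)) - 1/2)
}
c(mean(Y), var(Y))
# 0.3972250 0.2489453
c(1/sqrt(2*pi), 1/4)
# 0.3989423 0.2500000
plot(table(Y)/cases, ylab="proportion")
# normal density times the gap 1/sqrt(n) between values, to match proportions
curve(dnorm(x, 1/sqrt(2*pi), sqrt(1/4)) / sqrt(n), col="red", add=TRUE)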

giving

[Figure: simulated proportions of $Y$ for $n=100$, with the scaled normal density overlaid in red]
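
As a complement to the simulation, the exact binomial moments can be computed without any random sampling. The sketch below assumes only that $n \hat F(1/\sqrt{n})$ is binomial with success probability `pnorm(1/sqrt(n))`; the variable names (`p_approx`, `exact_mean`, `exact_var`, `check`) are chosen here for illustration.

```r
# Exact moments of Y = sqrt(n) * (Fhat(1/sqrt(n)) - 1/2) for several n,
# compared with the limiting values 1/sqrt(2*pi) and 1/4.
n <- c(10, 100, 1000, 10^4, 10^6)
p <- pnorm(1 / sqrt(n))                  # exact P(X_i <= 1/sqrt(n))
p_approx <- 1/2 + 1 / sqrt(2 * pi * n)   # first-order approximation of p
exact_mean <- sqrt(n) * (p - 1/2)        # E[Y], since n * Fhat ~ Bin(n, p)
exact_var  <- p * (1 - p)                # Var[Y] = n * p * (1 - p) / n
check <- data.frame(n, p, p_approx, exact_mean, exact_var)
print(check)
c(limit_mean = 1 / sqrt(2 * pi), limit_var = 1 / 4)
```

As $n$ grows, the `exact_mean` column should increase towards $1/\sqrt{2\pi}$ and `exact_var` towards $1/4$, which matches what the simulation above estimates for $n=100$.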