Convergence in distribution of empirical cdf almost surely.

261 Views Asked by At

Given i.i.d. random variables $\{X_n\}$, define the empirical cdf $$ \hat{F}_n(\omega, x) = \frac{1}{n} \sum_{i = 1}^n \mathbf{1}_{X_i(\omega) \leq x}$$ where $\omega \in \Omega$ and $x \in \mathbb{R}$. Show that $\hat{F}_n(\cdot, \cdot) \overset{d}{\to} F$ as $n \to \infty$, i.e. $$ P\left(\left\{\omega : \lim_{n \to \infty} \hat{F}_n(\omega, x) = F(x)\text{ for every } x \in C(F)\right\}\right)=1$$

I know that each $\hat{F}_n(\cdot, x)$ is a random variable, and each $\hat{F}_n(\omega, \cdot)$ is a cdf. Furthermore, for fixed $x \in \mathbb{R}$, we have $\hat{F}_n(\cdot, x) \overset{as}{\to} F$ by the Strong Law of Large Numbers; thus, each $$ P\left(\left\{\omega : \lim_{n \to \infty} \hat{F}_n(\omega, x) = F(x)\right\}\right) := P(A_x) = 1. $$

The problem is extending this to almost all $x$. If I only had to deal with countably many $x$, then I could just take a countable union of $A_x^c$ and obtain the result. It has been hinted that the problem can be reduced to this case, but I am still unable to figure out how.

2

There are 2 best solutions below

3
On BEST ANSWER

The following is a concise proof of Glivenko-Cantelli Theorem, it gives the uniform convergence, i.e., \begin{equation*} \mathsf{P}(\lim_{n\to\infty}\sup_t|F_n(t)-F(t)|=0)=1. \tag{1} \end{equation*} which also includes what you want.

By the strong law of Large numbers, as $n\to\infty$ \begin{gather*} F_n(t)=\frac1n\sum_{i=1}^{n}I_{\{X_i\le t\}} \stackrel{\text{a.s.}}{\longrightarrow}\mathsf{E}[I_{\{X_1\le t\}}]=F(t), \quad \forall t\in\mathbb{R}, \tag{2}\\ F_n(t-)=\frac1n\sum_{i=1}^{n}I_{\{X_i< t\}} \stackrel{\text{a.s.}}{\longrightarrow}\mathsf{E}[I_{\{X_1< t\}}]=F(t-), \quad \forall t\in\mathbb{R}. \tag{3} \end{gather*}

Given a fixed $\varepsilon>0$, take $$ t_0=-\infty, \quad t_i= \sup\{t: F(t)-F(t_{i-1})\le \varepsilon \}, \quad i\ge 1, $$ then there exist finite $k\le 1/\epsilon+1 $ and $-\infty=t_0<t_1<\cdots<t_k=\infty$ satisfy $$ F(t_i-)-F(t_{i-1})\le \epsilon,\qquad 1\le i \le k.$$ Now, for $t_{i-1}\le t<t_i$, \begin{gather*} F_n(t_{i-1})\le F_n(t)\le F_n(t_i-),\qquad F(t_{i-1})\le F(t)\le F(t_i-)\\ F_n(t)-F(t)\le F_n(t_i-) -F(t_i-)+\varepsilon,\\ F(t)-F_n(t)\le F(t_{i-1})- F_n(t_{i-1}) + \varepsilon. \end{gather*} Furthermore, \begin{align*} &\sup_{t\in\mathbb{R}}|F_n(t)-F(t)| \\ &\quad\le \sup_{1\le i\le k} [|F_n(t_{i-1})-F(t_{i-1})|+F_n(t_i-)-F(t_i-)] + \varepsilon \end{align*} Now from (2),(3) easy to get \begin{equation*} \mathsf{P}\Big(\varlimsup_{n\to\infty} \sup_{t\in\mathbb{R}}|F_n(t)-F(t)|\le \varepsilon\Big)=1. \end{equation*} Hence, (1) is also true.

0
On

While the existing answer provides a valid route to proving the desired result via the Glivenko-Cantelli theorem, I would like to offer a more simpler approach that avoids some of the technicalities involved in the theorem's proof.

You have already shown (using Kolmogorov's SLLN) that given $x \in \mathbb R,$ there exists $A_x \subseteq \Omega$ with $\mathbb P \left (A_x \right ) = 1,$ such that for all $\omega \in A_x,$ $\lim\limits_{n \to \infty} \widehat {F_n} (\omega, x) = F(x).$ Let $A = \bigcap\limits_{x \in \mathbb Q} A_x.$ Since $\mathbb Q$ is countable, it follows that $\mathbb P (A) = 1.$ So $\lim\limits_{n \to \infty} \widehat {F_n} (\omega, x) = F(x),$ for all $\omega \in A$ and for all $x \in \mathbb Q.$ Now let $x_0 \in C(F)$ and choose $s,t \in \mathbb Q,$ such that $s \lt x_0 \lt t.$ Since $x \mapsto \widehat {F_n} (\omega, x)$ is a cdf for each $\omega \in \Omega,$ it follows that $\widehat {F_n} (\omega, s) \leq \widehat {F_n} (\omega, x_0) \leq \widehat {F_n} (\omega, t),$ for all $\omega \in A.$ Thus for all $\omega \in A$ we have $$F(s) = \lim\limits_{n \to \infty} \widehat {F_n} (\omega, s) \leq \liminf\limits_{n \to \infty} \widehat {F_n} (\omega, x_0) \leq \limsup\limits_{n \to \infty} \widehat {F_n} (\omega, x_0) \leq \lim\limits_{n \to \infty} \widehat {F_n} (\omega, t) = F(t).$$ This is true for any $s, t \in \mathbb Q$ with $s \lt x \lt t.$ Since $x_0 \in C(F),$ letting $s \to x_0^-$ and $t \to x_0^+,$ we have $$\lim\limits_{n \to \infty} \widehat {F_n} (\omega, x_0) = F(x_0)$$ for all $\omega \in A.$ This shows that $$A \subseteq \left \{\omega \in \Omega\ :\ \lim\limits_{n \to \infty} \widehat {F_n} (\omega, x) = 0\ \text {for all}\ x \in C(F) \right \}.$$ Since $\mathbb P (A) = 1,$ the required result follows. $\square$