Glivenko-Cantelli for $k$-points


Let $\mu$ be a probability measure on $\mathbb{R}$, let $f$ be a measurable function on $\mathbb{R}$, and let $(X_i)_{i \in \mathbb{N}}$ be a sequence of iid random variables, all of law $\mu$. Then the Glivenko-Cantelli theorem allows us to derive the convergence in law $\frac{1}{n} \sum^n_{i=1} \delta_{f(X_i)} \stackrel{n \to \infty}{\rightarrow} f_* \mu$.

Do we have a $2$-point Glivenko-Cantelli theorem? That is, for $f$ a measurable function on $\mathbb{R}\times \mathbb{R}$, do we have the convergence in law $\frac{1}{n(n-1)} \sum_{\substack{i,j \in \{1,\cdots,n\}\\ i\neq j}} \delta_{f(X_i,X_j)} \stackrel{n \to \infty}{\rightarrow} f_* (\mu \otimes \mu)$?

And a $k$-point Glivenko-Cantelli theorem?

I don't know whether the $2$-point version can be deduced from the usual one, since the family $(X_i,X_j)_{\substack{i,j \in \mathbb{N}\\ i\neq j}}$ is not independent. Still, most pairs of members of this family are independent of one another...
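As a numerical sanity check of the $2$-point convergence asked about above (with $\mu$, $f$, and the test function $g$ chosen arbitrarily for illustration, none of them from the question), one can compare the pairwise empirical average against a Monte Carlo estimate of $\mathbb{E}[g(f(X_1,X_2))]$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) choices: mu = Uniform(0, 1), f(x, y) = x + y.
def f(x, y):
    return x + y

n = 2000
X = rng.uniform(0.0, 1.0, size=n)

# Evaluate the empirical measure (1 / n(n-1)) * sum_{i != j} delta_{f(X_i, X_j)}
# against a test function g, here g(t) = cos(t).
vals = f(X[:, None], X[None, :])   # n x n matrix of f(X_i, X_j)
mask = ~np.eye(n, dtype=bool)      # exclude the diagonal i == j
S_n = np.cos(vals[mask]).mean()

# Compare with E[g(f(X1, X2))] for genuinely independent X1, X2,
# estimated by Monte Carlo on fresh independent pairs.
Y1, Y2 = rng.uniform(0.0, 1.0, size=(2, 10**6))
E_g = np.cos(f(Y1, Y2)).mean()

print(abs(S_n - E_g))  # small for large n
```

For this choice, $\mathbb{E}[\cos(X_1+X_2)] = 2\cos 1 - \cos 2 - 1 \approx 0.4967$, and the pairwise average is close to it already for moderate $n$.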


The answer is positive, though I don't see any reason to speak of Glivenko-Cantelli here: that theorem concerns uniform convergence of distribution functions.

For notational simplicity, I will consider $m=2$. We want to show that there is an event of full probability on which, for each $g\in C_b(\mathbb R)$, $$ S_n(g) := \frac{1}{n(n-1)}\sum_{1\le i\neq j\le n} g\big(f(X_i,X_j)\big)\to \mathbb{E}\big[g\big(f(X_1,X_2)\big)\big]=:E(g), \quad n\to\infty. $$ Writing $h = g\circ f$, the almost sure convergence for a fixed $g$ follows from the strong law of large numbers for $U$-statistics with kernel $h$. The issue, however, as in the case $m=1$, is that the exceptional null set depends on $g$. And as for $m=1$, this issue is resolved in two steps:

Step 1. Localization. Denote $g_N(y) = g(y)\mathbf 1_{|y|\le N}$. Write $$ |S_n(g) - S_n(g_N)|\le \|g\|_\infty \cdot S_n(\mathbf 1_{|y|>N}), $$ whence (appealing to the SLLN for $U$-statistics) $$ \limsup_{n\to \infty} |S_n(g) - S_n(g_N)|\le \|g\|_\infty \cdot \mathbb{P}(|f(X_1,X_2)|>N) $$ almost surely for every integer $N\ge 1$. Since there are countably many values of $N$, we can choose an exceptional event independent of $N$ (note that it does not depend on $g$ either). We also have $|E(g) - E(g_N)|\le \|g\|_\infty \cdot \mathbb{P}(|f(X_1,X_2)|>N)$, so $$ \limsup_{n\to \infty} |S_n(g) - E(g)|\le \limsup_{n\to \infty} |S_n(g_N) - E(g_N)| + 2\|g\|_\infty \cdot \mathbb{P}(|f(X_1,X_2)|>N). $$ Since $\mathbb{P}(|f(X_1,X_2)|>N)\to 0$ as $N\to\infty$, it suffices to show that $S_n(g_N) \to E(g_N)$, $n\to\infty$, on some event of full probability independent of $g$ and $N$.
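The localization bound $|S_n(g) - S_n(g_N)|\le \|g\|_\infty \cdot S_n(\mathbf 1_{|y|>N})$ is a pointwise inequality and can be checked numerically; the choices below ($f(x,y)=xy$, standard normal $\mu$, $g=\arctan$, $N=2$) are illustrative assumptions, not taken from the answer:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative (assumed) choices: f(x, y) = x * y, mu = N(0, 1),
# g = arctan (so ||g||_inf = pi/2), truncation level N = 2.
def f(x, y):
    return x * y

g = np.arctan
n, N = 500, 2.0
X = rng.standard_normal(n)

vals = f(X[:, None], X[None, :])
mask = ~np.eye(n, dtype=bool)      # keep only pairs with i != j
y = vals[mask]

S_g = g(y).mean()                              # S_n(g)
S_gN = (g(y) * (np.abs(y) <= N)).mean()        # S_n(g_N)
bound = (np.pi / 2) * (np.abs(y) > N).mean()   # ||g||_inf * S_n(1_{|y| > N})

print(abs(S_g - S_gN) <= bound + 1e-12)        # prints True
```

The inequality holds exactly for any realization, since $S_n(g) - S_n(g_N)$ averages $g(y)\mathbf 1_{|y|>N}$, which is dominated by $\|g\|_\infty \mathbf 1_{|y|>N}$.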

Step 2. Separability. For each $N\ge 1$, choose a countable dense subset $\mathcal C_N \subset C([-N,N])$. Thanks to the SLLN for $U$-statistics, for each $h\in \mathcal C_N$, $S_n(h) \to E(h)$, $n\to\infty$, almost surely$^*$. Since $\bigcup_{N\ge 1}\mathcal C_N$ is countable, the convergence holds on some event $\Omega'$ of full probability independent of $h$ and $N$. Given any $g\in C([-N,N])$ and any $\varepsilon>0$, choose $h\in \mathcal C_N$ with $\|g-h\|_\infty<\varepsilon$; arguing as in Step 1, we get $$ \limsup_{n\to\infty} |S_n(g) - E(g)| \le \limsup_{n\to\infty} |S_n(h) - E(h)| + 2\varepsilon = 2\varepsilon $$ on $\Omega'$, and we conclude by letting $\varepsilon\to 0$.
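Step 2's transfer from $h$ to $g$ can be illustrated in miniature. Below, the countable dense family is stood in for by polynomial approximants on $[-N,N]$ (polynomials with rational coefficients are countable and dense by Stone-Weierstrass); $g=\cos$, $N=3$, and the simulated values are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

# On [-N, N], approximate a continuous g by a polynomial h; then any average
# of g over points in [-N, N] differs from the same average of h by at most
# ||g - h||_inf, which is the transfer used in Step 2.
N = 3.0
grid = np.linspace(-N, N, 2001)
g = np.cos

# Degree-12 Chebyshev least-squares fit as the approximant h.
h = np.polynomial.Chebyshev.fit(grid, g(grid), deg=12)
eps = np.max(np.abs(g(grid) - h(grid)))   # ~ ||g - h||_inf on the grid

# Check the transfer inequality on simulated "f(X_i, X_j)" values; sampling
# from the grid keeps the sup-norm bound exact at the sampled points.
y = rng.choice(grid, size=10**5)
diff = abs(g(y).mean() - h(y).mean())
print(diff <= eps)                         # prints True
```

The point is only that $|S_n(g) - S_n(h)| \le \|g-h\|_\infty$ for any empirical average supported in $[-N,N]$, which is what lets the countably many $h$ control all of $C([-N,N])$.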


$^*$ There is a slight abuse of notation here: in this step, $S_n(h)$ really means $S_n(\tilde h)$, where $\tilde h = h$ on $[-N,N]$ and $\tilde h = 0$ elsewhere.