Law of large numbers for a sequence of random variables


Suppose we have a sequence of random variables $(X^M)_{M \in \mathbb{N}}$ that converges almost surely to a random variable $X^0$, and for each $M \in \mathbb{N}$ let $X_1^M, \ldots, X_M^M$ be iid samples from $X^M$.

Under which conditions does the law of large numbers hold uniformly in the sense that

$$ |M^{-1} \sum_{m=1}^M X_m^M - \mathbb{E}(X^0)| \xrightarrow{M \to \infty} 0 \; \text{ almost surely?}$$

I am happy to assume that the $X^M$ are uniformly bounded, i.e., that there is a constant $K$ such that

$$ |X^M| \leq K \; \text{ almost surely for all } M \in \mathbb{N}_0.$$

I looked into uniform LLNs, but they generally do not seem to fit the setting above.
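For concreteness, here is a minimal simulation of the setting (the concrete choice $X^0 \sim \mathrm{Unif}(0,1)$ and $X^M = \lfloor M X^0 \rfloor / M$ is purely illustrative, not part of the question; it is bounded by $1$ and converges to $X^0$ almost surely):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_X_M(M, size):
    """Draw iid copies of X^M = floor(M * X^0) / M with X^0 ~ Unif(0, 1).

    This is just one concrete example of a bounded sequence X^M -> X^0 a.s.;
    the question is about general such sequences.
    """
    x0 = rng.uniform(0.0, 1.0, size=size)
    return np.floor(M * x0) / M

# Diagonal averages M^{-1} sum_{m=1}^M X_m^M, compared with E[X^0] = 1/2.
for M in (10, 100, 1000, 10000):
    avg = sample_X_M(M, M).mean()
    print(f"M = {M:6d}:  diagonal average = {avg:.4f}  (target 0.5)")
```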


BEST ANSWER (by zhoraster)

Assuming uniform integrability of $(X^m)$, we have $\mathrm E[X^m]\to \mathrm E[X^0]$, $m\to\infty$. Therefore, it is enough to show that $$ \left| (\overline{X^m})_m - \mathrm E[X^m]\right|\to 0, \quad m\to \infty,\tag{1} $$ almost surely, where $(\overline{X^m})_n = \frac1n \sum_{i=1}^n X_i^m$ and $(\overline{X^m})_m$ denotes this average at $n = m$.

One possibility is to go through concentration inequalities. For example, if the variables are bounded, as in your question, then by the Hoeffding inequality, for any $\varepsilon>0$, $$ \mathrm P\left(\left| (\overline{X^m})_m - \mathrm E[X^m]\right|>\varepsilon\right)\le e^{-C \varepsilon^2 m} $$ with some $C>0$. Using the Borel-Cantelli lemma, we easily get $(1)$.
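To spell out the Borel–Cantelli step: for each fixed $\varepsilon>0$ the Hoeffding bounds are summable,
$$ \sum_{m=1}^\infty \mathrm P\left(\left| (\overline{X^m})_m - \mathrm E[X^m]\right|>\varepsilon\right) \le \sum_{m=1}^\infty e^{-C \varepsilon^2 m} = \frac{e^{-C\varepsilon^2}}{1-e^{-C\varepsilon^2}} < \infty, $$
so almost surely only finitely many of these events occur; intersecting over $\varepsilon = 1/k$, $k \in \mathbb{N}$, gives $(1)$.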

Another possibility is, as I commented, to deduce the uniform convergence $$ \sup_m \left| (\overline{X^m})_n - \mathrm E[X^m]\right|\to 0, n\to \infty,\tag{2} $$ from the uniform law of large numbers. However, it seems unlikely that the almost sure convergence can be shown this way; I will only outline the convergence in probability.

Let $F^m$ be the cdf of $X^m$ and let $Q^m(t) = \sup\{x\in \mathbb R: F^m(x)<t\}$, $t\in(0,1)$, be its quasi-inverse (quantile function). Then, as is well known, $X^m \overset{d}{=} Q^m(U)$, where $U$ is a uniform $[0,1]$ variable. Therefore, $$ (\overline{X^m})_n \overset{d}{=} \frac1n \sum_{k=1}^n Q^m(U_k), $$ where $U_1,U_2,\dots$ are iid uniform $[0,1]$ variables. It also follows from the weak convergence $X^m\to X^0$ (implied by the assumed almost sure convergence) that $Q^m\to Q^0$ pointwise at the continuity points of $Q^0$, hence almost everywhere on $(0,1)$.
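A quick numerical sanity check of the representation $X^m \overset{d}{=} Q^m(U)$ (a sketch only; the rounding variable $X^m = \lfloor m U'\rfloor/m$, $U' \sim \mathrm{Unif}(0,1)$, and its explicit quasi-inverse are illustrative choices, not part of the argument):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 50, 100_000

def Q_m(t, m):
    # Quasi-inverse of the discrete uniform on {0, 1/m, ..., (m-1)/m}:
    # Q^m(t) = (ceil(m*t) - 1) / m for t in (0, 1).
    return (np.ceil(m * t) - 1.0) / m

direct = np.floor(m * rng.uniform(size=n)) / m   # sample X^m directly
via_q = Q_m(rng.uniform(size=n), m)              # sample as Q^m(U)

print("mean (direct)      :", direct.mean())
print("mean (Q^m(U))      :", via_q.mean())
print("theoretical E[X^m] :", (m - 1) / (2 * m))
```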

Now let $\Theta = \{m^{-1}, m\ge 1\}\cup \{0\}$ and set $f(t,m^{-1}) = Q^m(t)$, $m\ge 1$, and $f(t,0) = Q^0(t)$. Then, as explained above, $f(t,\theta)$ is continuous in $\theta$ for almost all $t$ (with respect to the distribution of $U$). Therefore, assuming the existence of an integrable majorant for $f(U,m^{-1})=Q^m(U)$ (which is easily seen to be equivalent to the uniform integrability of $X^m$), we get that $$ \sup_{\theta\in \Theta}\left| \frac1n \sum_{k=1}^n f(U_k,\theta) - \mathrm{E}[f(U,\theta)]\right| \to 0, \quad n\to \infty, $$ almost surely, whence we get the convergence $(2)$ in probability (remember that we replaced $(\overline{X^m})_n$ by a distributional copy).

Convergence in probability might sound like a weak conclusion, but the approach has at least two advantages:

  1. Only uniform integrability is required.

  2. The approach works for any $(n_m,m\ge 1)$ such that $n_m\to\infty$, $m\to\infty$, i.e. we have $$ \left| (\overline{X^m})_{n_m} - \mathrm E[X^m]\right|\to 0, m\to \infty, $$ in probability. The first approach fails (to establish the almost sure convergence) for "small" $n_m$.

ANSWER (self-answer by the asker)

In the case I am considering, the random variable $X^M$ can actually be written as a function of $X^0$ and $M$, i.e., $X^M = f(X^0, M)$ for all $M \in \mathbb{N}$.

A priori, this function is only defined for integer $M$. However, I can extend it to a function $f: \mathbb{R} \times [0,1] \to \mathbb{R}$ by setting $f(x, M^{-1}) = f(x, M)$ for $M \in \mathbb{N}$, $f(x,0) = x$, and interpolating linearly in between: for $\theta \in \big[(M+1)^{-1}, M^{-1}\big]$, write $\theta = \lambda M^{-1} + (1-\lambda)(M+1)^{-1}$ with $\lambda = \frac{\theta-(M+1)^{-1}}{M^{-1}-(M+1)^{-1}} \in [0,1]$, and define $$ f(x, \theta) = \lambda\, f(x,M) + (1-\lambda)\, f(x,M+1). $$
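A direct implementation of this extension (a sketch: `f_integer` is a placeholder for the original $f(x, M)$, here instantiated with the illustrative rounding map $\lfloor M x \rfloor / M$):

```python
import math

def f_integer(x, M):
    # Placeholder for the original f(x, M); the rounding map is only an example.
    return math.floor(M * x) / M

def f_extended(x, theta):
    """Piecewise-linear extension in theta, with f(x, 0) := x."""
    if theta == 0.0:
        return x
    M = math.floor(1.0 / theta)        # theta lies in ((M+1)^{-1}, M^{-1}]
    if theta == 1.0 / M:               # grid point: agree with f(x, M)
        return f_integer(x, M)
    # theta = lam * M^{-1} + (1 - lam) * (M+1)^{-1} with lam in (0, 1)
    lam = (theta - 1.0 / (M + 1)) / (1.0 / M - 1.0 / (M + 1))
    return lam * f_integer(x, M) + (1.0 - lam) * f_integer(x, M + 1)
```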

By my assumption on the almost sure convergence $X^M \to X^0$, it follows that $$ f(x, \theta) \xrightarrow{\theta \to 0} f(x, 0) = x $$ for almost all $x$ (with respect to the distribution of $X^0$), and the continuity of $f$ at every other point $\theta \in (0,1]$ follows from the piecewise-linear definition.

Hence, in this case the conditions for the uniform LLN are fulfilled, and it would follow that $$\sup_{\theta \in [0,1]} \left| n^{-1} \sum_{i=1}^n f(X_i^0, \theta) - \mathbb{E} (f(X^0, \theta)) \right| \xrightarrow{n \to \infty} 0 \quad \text{almost surely.}$$ This would imply $$\sup_{M \in \mathbb{N}} \left| n^{-1} \sum_{i=1}^n f(X_i^0, M^{-1}) - \mathbb{E} (f(X^0, M^{-1})) \right| = \sup_{M \in \mathbb{N}} \left| n^{-1} \sum_{i=1}^n X_i^M - \mathbb{E} (X^M) \right| \xrightarrow{n \to \infty} 0 \quad \text{almost surely,}$$ where $X_i^M := f(X_i^0, M)$ are iid copies of $X^M$.

Hence, almost surely, for every $\epsilon > 0$ there is an $N(\epsilon)$ (depending on the realization) such that $$\sup_{M \in \mathbb{N}} \left| n^{-1} \sum_{i=1}^n X_i^M - \mathbb{E} (X^M) \right| < \epsilon \quad \forall n \geq N(\epsilon).$$ Taking $n = M$, this means that for all $M \geq N(\epsilon)$, $$\left| M^{-1} \sum_{i=1}^M X_i^M - \mathbb{E} (X^M) \right| < \epsilon.$$

As mentioned in the post by zhoraster above, $$ |\mathbb{E}(X^M) - \mathbb{E}(X^0)| \xrightarrow{M \to \infty} 0 $$ by the boundedness assumption on the $X^M$ (or indeed under the weaker condition of uniform integrability); note that the expectations are deterministic, so no almost-sure qualifier is needed here.

Putting these two things together establishes (via an application of the triangle inequality) that $$ |M^{-1} \sum_{m=1}^M X_m^M - \mathbb{E}(X^0)| \xrightarrow{M \to \infty} 0 \; \text{ almost surely.}$$
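As a closing numerical sanity check of the uniform statement (again with the illustrative $f(x, M) = \lfloor M x \rfloor / M$, for which $\mathbb{E}[f(X^0, M^{-1})] = (M-1)/(2M)$ is explicit; the supremum over $M$ is truncated to a finite range):

```python
import numpy as np

rng = np.random.default_rng(2)

def f(x, M):
    # Illustrative choice, as above: f(x, M) = floor(M * x) / M.
    return np.floor(M * x) / M

n = 200_000
x0 = rng.uniform(size=n)   # iid samples of X^0 ~ Unif(0, 1)

# Finite proxy for sup_M |n^{-1} sum_i f(X_i^0, 1/M) - E[f(X^0, 1/M)]|.
sup_err = max(
    abs(f(x0, M).mean() - (M - 1) / (2 * M))
    for M in range(1, 1001)
)
print("sup_M |empirical mean - E[X^M]| =", sup_err)
```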