Law of large numbers and a different model for the average of IID trials


Let $(X, \mathcal{A}, \mu)$ be a probability space. Let $\boldsymbol{\psi} = (\psi_n: X \rightarrow \mathbb{R})_{n \ge 0}$ be an IID sequence of random variables such that $\mathbb{E}(\psi_0)$ and $\operatorname{Var}(\psi_0)$ are finite.

The law of large numbers says that the sequence of random variables $(A(\boldsymbol{\psi},n): X \rightarrow \mathbb{R})_{n \ge 0}$, where $A(\boldsymbol{\psi},n)(x) = \frac{\psi_0(x) + \ldots + \psi_{n-1}(x)}{n}$, converges $\mu$-a.e. to $\int \psi_0 \, d\mu$.

Now consider the following model for the average of a finite number of trials of an IID sequence of random variables: $(B(\boldsymbol{\psi},n): X^\mathbb{N} \rightarrow \mathbb{R})_{n \ge 0}$, given by $B(\boldsymbol{\psi},n)(x_0, \ldots, x_{n-1}, \ldots) = \frac{\psi_0(x_0) + \ldots + \psi_{n-1}(x_{n-1})}{n}$. Here, $X^\mathbb{N}$ is to be equipped with the product $\sigma$-algebra $\hat{\mathcal{A}}=\bigotimes_{j=0}^{\infty} \mathcal{A}$ and measure $\hat{\mu}=\bigotimes_{j=0}^{\infty} \mu$.

For me, it seems clear (if false, please let me know; I have not done the maths, just argued mentally) that the distributions of $A(\boldsymbol{\psi},n)$ (namely, $A(\boldsymbol{\psi},n)_* \mu$) and $B(\boldsymbol{\psi},n)$ (namely, $B(\boldsymbol{\psi},n)_* \hat{\mu}$) are equal.
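The mental argument can be probed numerically in a concrete instance. Below is a minimal sketch (assuming NumPy), taking $X = [0,1)$ with Lebesgue measure and $\psi_k(x)$ the $k$-th binary digit of $x$, which is a genuinely IID Bernoulli$(1/2)$ sequence living on a single probability space:

```python
import numpy as np

rng = np.random.default_rng(0)

def digit(x, k):
    # psi_k(x): the k-th binary digit of x in [0, 1).
    return np.floor(x * 2 ** (k + 1)).astype(int) % 2

n, trials = 8, 200_000

# Model A: one sample x per trial; every psi_k is evaluated at the same x.
x = rng.random(trials)
A = np.mean([digit(x, k) for k in range(n)], axis=0)

# Model B: n independent samples per trial; psi_k is evaluated at x_k.
xs = rng.random((n, trials))
B = np.mean([digit(xs[k], k) for k in range(n)], axis=0)

# Both averages take values k/n; compare the empirical distributions.
edges = np.linspace(0, 1 + 1 / n, n + 2)
print(np.histogram(A, bins=edges)[0] / trials)
print(np.histogram(B, bins=edges)[0] / trials)
```

Both printed rows approximate the Binomial$(n, 1/2)/n$ mass function, consistent with the claim that the two pushforwards coincide.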

My questions are:

1) Probabilistically speaking, what are the relevant differences between $(A(\boldsymbol{\psi},n))_{n \ge 0}$ and $(B(\boldsymbol{\psi},n))_{n \ge 0}$?

2) What kind of convergence can we obtain for $(B(\boldsymbol{\psi},n))_{n \ge 0}$?

3) Does the answer to (2) involve putting the law of large numbers in different clothing? Which one?

I appreciate your thoughts and hope you enjoy the question! Lucas A.


There are 2 best solutions below


This is a good question! I think the two models you considered are actually equivalent. A natural way to construct IID random variables is to build an infinite product space $X^\mathbb{N}$ and let each $\psi_n$ be determined by the $n$-th copy of $X$. It should be intuitive that the resulting $\psi_n$'s are independent, and they are identically distributed because every coordinate carries the same copy of $\mu$. We can then naturally discuss convergence in the finite model you gave, $(X^n, \hat{\mathcal{A}}, \hat{\mu})$, since each of these embeds naturally in the larger space $X^\mathbb{N}$ (with the product $\sigma$-algebra and product measure). So the short answer is: you have started the explicit construction of IID random variables!


We can rephrase the problem and attempt a solution this way:

Let $(X, \mathcal{A}, \mu)$ be a probability space. Let $\boldsymbol{\psi} = (\psi_n: X \rightarrow \mathbb{R})_{n \ge 0}$ be an IID sequence of random variables such that $\mathbb{E}(\psi_0)$ and $\operatorname{Var}(\psi_0)$ are finite.

Let $(X^\mathbb{N}, \hat{\mathcal{A}}, \hat{\mu})$ be the product space. Define $\boldsymbol{\varphi} = (\varphi_n: X^\mathbb{N} \rightarrow \mathbb{R})_{n \ge 0}$ given by $\varphi_n(x_0, x_1, \ldots) = \psi_n(x_n)$. These should be IID (need to check).
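The "need to check" is mostly structural: $\varphi_n$ depends only on the $n$-th coordinate, so independence of the $\varphi_n$'s comes from the product measure alone, and identical distribution follows because every $\psi_n$ shares the law of $\psi_0$. A toy numerical check of this (assuming NumPy), taking for illustration only $X = [0,1]$ with Lebesgue measure and $\psi_n(x) = x$ for every $n$:

```python
import numpy as np

rng = np.random.default_rng(2)
trials = 200_000

# Coordinates (x_0, x_1) of hat-x, drawn from the product measure:
# each pair is two independent uniforms on [0, 1].
x0, x1 = rng.random((2, trials))

# phi_0(hat-x) = psi_0(x_0) = x_0 and phi_1(hat-x) = psi_1(x_1) = x_1.
phi0, phi1 = x0, x1

print(phi0.mean(), phi1.mean())        # same marginal mean (~0.5)
print(np.cov(phi0, phi1)[0, 1])        # ~0: no linear dependence
# A sharper check on events: P(phi_0 < 1/2, phi_1 < 1/2) ~ 1/4.
print(np.mean((phi0 < 0.5) & (phi1 < 0.5)))
```

This is only a sanity check of the two defining properties (equal marginals, independence across coordinates), not a proof; the proof is the product-measure argument above.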

Consider the sequence of random variables $(A(\boldsymbol{\psi},n): X \rightarrow \mathbb{R})_{n \ge 0}$ given by $A(\boldsymbol{\psi},n): x \mapsto \frac{\psi_0(x) + \ldots + \psi_{n-1}(x)}{n}$.

Now note that the sequence of random variables $(B(\boldsymbol{\psi},n): X^\mathbb{N} \rightarrow \mathbb{R})_{n \ge 0}$, given by $B(\boldsymbol{\psi},n): \hat{x}=(x_0, \ldots, x_{n-1}, \ldots) \mapsto \frac{\psi_0(x_0) + \ldots + \psi_{n-1}(x_{n-1})}{n}$, satisfies $B(\boldsymbol{\psi},n) = A(\boldsymbol{\varphi},n)$ for every $n$.

Since we know $(\psi_n)_{n \ge 0}$ is $\mu$-IID and $(\varphi_n)_{n \ge 0}$ is $\hat{\mu}$-IID, the law of large numbers guarantees that:

(I) $A(\boldsymbol{\psi},n)(x) = \frac{\psi_0(x) + \ldots + \psi_{n-1}(x)}{n}$ converges $\mu$-a.e. to $\int \psi_0 d\mu$ and

(II) $A(\boldsymbol{\varphi},n)(\hat{x}) = B(\boldsymbol{\psi},n)(\hat{x}) = \frac{\psi_0(x_0) + \ldots + \psi_{n-1}(x_{n-1})}{n}$ converges $\hat{\mu}$-a.e. to $\int \varphi_0 d \hat{\mu} = \int \psi_0 d\mu$.

So after some routine bookkeeping, the law of large numbers guarantees that both sequences converge almost everywhere to the same number.
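A minimal numerical illustration of (I) and (II), assuming NumPy: under the IID hypothesis, a sample of $(\psi_0(x), \ldots, \psi_{n-1}(x))$ with $x \sim \mu$ has the same joint law as $n$ independent draws from the common distribution, and so does $(\psi_0(x_0), \ldots, \psi_{n-1}(x_{n-1}))$ under $\hat{\mu}$, so one stream of IID draws simulates both running averages at once (toy common law: uniform on $[0,1]$, so the limit is $1/2$):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# One stream of IID draws stands in for both (psi_k(x))_k under mu
# and (psi_k(x_k))_k under mu-hat, since the joint laws coincide.
draws = rng.random(n)
running = np.cumsum(draws) / np.arange(1, n + 1)

print(running[99], running[9_999], running[-1])  # drifting toward 0.5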

What is still not clear to me is that these are models of the same probabilistic problem. I can make the diagram with the $\varphi_n$'s and $\psi_n$'s semi-commute using the measure-preserving projection $\pi_n(\hat{x}) = x_n$. But I'm not sure what to do about the diagram with $B(\boldsymbol{\psi},n)=A(\boldsymbol{\varphi},n)$ and $A(\boldsymbol{\psi},n)$. To be honest, I'm not sure what a probabilist would take an isomorphism to be in this context.