Weak Law of Large Numbers for a non-iid, non-ergodic sequence

I have a somewhat open-ended question. Let's say I have a sequence of random variables $(X_n: n \geq 1)$ which are neither independent, nor ergodic, nor identically distributed. Normally I would say that I am completely dead in the water, but let's say that $X_n \overset{d}{\to} X$. Are there any additional assumptions under which I can say that:

$$ \frac{1}{n} \sum_{i=1}^n X_i \;\overset{P}{\to}\; \mathbb{E}X $$

Even if I assume that the expectation of the left-hand side converges to $\mathbb{E}X$, I'm stuck on how to approach this in general. Any tips?

EDIT: Thinking about this some more, I feel like making a martingale out of the LHS and then checking under what conditions we have the desired martingale convergence would be a reasonable route to follow. Any thoughts on this?

EDIT 2: per Nate Eldredge's comment below, I need to assume separately that the expectation of the normalized partial sum $\frac{1}{n} \sum_{i=1}^n X_i$ converges to $EX$; it doesn't follow from $X_n \overset{d}{\to} X$.
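
For a concrete illustration of why convergence in distribution alone is not enough, here is one standard counterexample. The specific sequence is an illustrative choice, not taken from the comment:

$$ P(X_n = n) = \frac{1}{n}, \qquad P(X_n = 0) = 1 - \frac{1}{n}. $$

Then $X_n \overset{d}{\to} X = 0$ because $P(X_n \neq 0) = 1/n \to 0$, yet $\mathbb{E}X_n = 1$ for every $n$, so

$$ \mathbb{E}\left[\frac{1}{n} \sum_{i=1}^n X_i\right] = 1 \neq 0 = \mathbb{E}X. $$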

There are 2 best solutions below

Revised per OP comments

Here is a writeup on Laws of Large Numbers for non-iid random variables. I think Law I on pdf page 4 has what you are looking for:

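Given a sequence of square-integrable random variables $X_i$, if the variances are uniformly bounded and the covariances are nonpositive (or suitably controlled in absolute value, for instance decaying as $|i-j|$ grows), then the centered normalized sum $n^{-1} \sum_{i=1}^n (X_i - E[X_i])$ converges to $0$ in probability. Combined with your assumption that the expectation of the normalized sum converges to $E[X]$, this gives convergence of the normalized sum to $E[X]$ in probability.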
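
As a rough sketch of why a condition of this type suffices, take the simplest case, assuming $\mathrm{Var}(X_i) \leq C$ for all $i$ and $\mathrm{Cov}(X_i, X_j) \leq 0$ for $i \neq j$. The constant $C$ and the notation $\bar{X}_n = n^{-1} \sum_{i=1}^n X_i$ are introduced here for illustration and are not from the writeup. Then

$$ \mathrm{Var}(\bar{X}_n) = \frac{1}{n^2} \left[ \sum_{i=1}^n \mathrm{Var}(X_i) + 2 \sum_{1 \leq i < j \leq n} \mathrm{Cov}(X_i, X_j) \right] \leq \frac{C}{n}, $$

and Chebyshev's inequality gives, for every $\varepsilon > 0$,

$$ P\left( \left| \bar{X}_n - E[\bar{X}_n] \right| > \varepsilon \right) \leq \frac{\mathrm{Var}(\bar{X}_n)}{\varepsilon^2} \leq \frac{C}{n \varepsilon^2} \to 0. $$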

After doing some reading here, it seems like martingales do offer a general approach.

Assume that the expectations of $n^{-1} \sum_{i=1}^n X_i$ converge to $E X$. Then if it is possible to show that the normalized partial sums $n^{-1} \sum_{i=1}^n X_i$ concentrate around their means sufficiently strongly, we can apply the triangle inequality to show that $n^{-1} \sum_{i=1}^n X_i$ converges to $EX$ in probability.
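
Spelled out, the triangle-inequality step looks roughly like this, writing $\bar{X}_n = n^{-1} \sum_{i=1}^n X_i$ (shorthand introduced here for convenience):

$$ \left| \bar{X}_n - EX \right| \leq \left| \bar{X}_n - E\bar{X}_n \right| + \left| E\bar{X}_n - EX \right|. $$

Fix $\varepsilon > 0$. Since $E\bar{X}_n \to EX$, for all large enough $n$ the second term is below $\varepsilon/2$, and therefore

$$ P\left( \left| \bar{X}_n - EX \right| > \varepsilon \right) \leq P\left( \left| \bar{X}_n - E\bar{X}_n \right| > \frac{\varepsilon}{2} \right), $$

so it is enough that the right-hand side tends to $0$ for each $\varepsilon$.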

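The step where martingales enter is showing the concentration. If one can build a martingale with bounded increments whose terminal value is the partial sum (for example, the Doob martingale obtained by conditioning $\sum_{i=1}^n X_i$ on the first $k$ steps), then Azuma's inequality is a possible solution. Note that the running averages $n^{-1} \sum_{i=1}^n X_i$ being a martingale in $n$ would not be enough by itself, since such a martingale can converge to a non-constant limit. The Doob-martingale argument is essentially what is used in a classic paper on preferential attachment random graph models: http://www.ee.columbia.edu/~jiantan/E6083/swdeg.pdf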
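
A minimal sketch of how Azuma's inequality could supply the concentration, under assumptions that are mine rather than the paper's: a filtration $(\mathcal{F}_k)$, the Doob martingale of $S_n = \sum_{i=1}^n X_i$, and a uniform increment bound $c$.

$$ M_k = E[S_n \mid \mathcal{F}_k], \quad k = 0, \dots, n, \qquad M_0 = E S_n, \quad M_n = S_n. $$

If $|M_k - M_{k-1}| \leq c$ almost surely for every $k$, Azuma's inequality gives

$$ P\left( |S_n - E S_n| \geq t \right) \leq 2 \exp\left( -\frac{t^2}{2 n c^2} \right), $$

and taking $t = n \varepsilon$ yields

$$ P\left( \left| \frac{S_n}{n} - \frac{E S_n}{n} \right| \geq \varepsilon \right) \leq 2 \exp\left( -\frac{n \varepsilon^2}{2 c^2} \right) \to 0. $$

Whether the increments are actually bounded has to be checked for the specific dependence structure of the $X_i$; that is where the real work lies.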

EDIT: per Nate Eldredge's comment below.