Central limit theorem without independence

390 Views Asked by At

Let $(X_n)_n$ be a sequence of i.i.d random variables following the uniform distribution $\mathcal{U}([0;1]).$ Let $f:\mathbb{R}^3 \rightarrow\mathbb{R},$ be a continuous function.

Prove that $$W_n=\frac{1}{\sqrt{n}}(\sum_{k=1}^nf(X_k,X_{k+1},X_{k+2})-n\int_{[0;1]^3}f(x,y,w)dxdydw)$$ converges in distribution.

The exercise seems to be an application of the central limit theorem, since $Y_n=f(X_n,X_{n+1},X_{n+2})-E[f(X_1,X_2,X_3)]$ have the same distribution, but the problem is that $(Y_n)_n$ are not independent, I tried to write $$W_n=\frac{1}{\sqrt{n}}\sum_{k=1}^{\left \lfloor{\frac{n}{3}}\right \rfloor}Y_{3k}+\frac{1}{\sqrt{n}}\sum_{k=0}^{\left \lfloor{\frac{n-1}{3}}\right \rfloor}Y_{3k+1}+\frac{1}{\sqrt{n}}\sum_{k=0}^{\left \lfloor{\frac{n-2}{3}}\right \rfloor}Y_{3k+2},$$ This way, $(Y_{3n})_n,(Y_{3n+1})_n,(Y_{3n+2})_n$ are independent, so we can apply the central limit theorem, but in the expression of $W_n$ the considered random variables aren't independent so that we can say that the sum converges in distribution.

Is there a way to solve the problem? And more generally, the result remains true if $(X_n)_n$ is i.i.d without knowing the related distribution?

This exercise is inspired from problem 10, page 156, from Theory of Probability and Random Processes. Maybe there is a way using Markov chains and ergodicity, but this is not in the program, since in the mentioned book, Markov chains and related results are explained, later, in other chapters.

enter image description here

1

There are 1 best solutions below

2
On

Let $Y_i:=f(X_i,X_{i+1},X_{i+2})$ and $\widetilde{Y}_i:=Y_i-\mathsf{E}Y_i$. It is clear that $Y_i$ and $Y_{i+j}$ are independent for $j\ge 3$. Split the sum of $\widetilde{Y}_i$'s into $K$ blocks of size $p$ with gaps of size $2$ between each pair of consecutive blocks, i.e., write \begin{align} \sum_{i=1}^{n} \widetilde{Y}_i&=(\widetilde{Y}_1+\cdots+\widetilde{Y}_p)+(\widetilde{Y}_{p+1}+\widetilde{Y}_{p+2})+(\widetilde{Y}_{p+3}+\cdots+\widetilde{Y}_{2p+2})+\cdots\\ &=\sum_{j=1}^K B_{n,j}+\sum_{j=1}^K S_{n,j}+R_n, \end{align} where $B_{n,j}:=(\widetilde{Y}_{(j-1)p+2j-1}+\cdots+\widetilde{Y}_{jp+2(j-1)})$, $S_{n,j}=\widetilde{Y}_{jp+2j-1}+\widetilde{Y}_{jp+2j}$, and $R_n$ is the remainder term.


Choose $K=\ln(n)$ and $p=n/\ln(n)$. Since $\{B_{n,j}\}_{j=1}^K$ is a triangular array of centered i.i.d. random variables, we can apply the central limit theorem (for triangular arrays). The approximation error consists of at most $3\ln(n)$ terms ($K$ terms of size $2$ and at most $\ln(n)$ terms in $R_n$) and is asymptotically negligible.