Central Limit Theorem without IID?


Consider the Central Limit Theorem: Suppose that $X_1, X_2, \ldots, X_n$ are independent and identically distributed random variables with mean $\mu$ and variance $\sigma^2 < \infty$. Let $\bar{X} = (X_1 + X_2 + \cdots + X_n)/n$ be the sample mean. Then, as $n \rightarrow \infty$, $\sqrt{n}(\bar{X} - \mu)/\sigma$ converges in distribution to a standard normal random variable, i.e.,

$$\lim_{n \rightarrow \infty} \mathbb{P} \left( \sqrt{n} \frac{\bar{X} - \mu}{\sigma} \leq x \right) = \Phi(x)$$

where $\Phi(x)$ is the cumulative distribution function of the standard normal distribution.

In introductory math classes, we are often told that the Central Limit Theorem applies only when $X_1, X_2, \ldots, X_n$ are independent and identically distributed (iid) - yet the importance of this condition is not really explained. I am trying to understand why this iid condition is so important.

While trying to learn more about the importance of this iid condition, I came upon variants of the Central Limit Theorem in which this condition is partly relaxed (e.g. https://en.wikipedia.org/wiki/Lindeberg%27s_condition , https://en.wikipedia.org/wiki/Central_limit_theorem#Lyapunov_CLT) - but I still could not find an explanation of why the conclusion of the classic Central Limit Theorem might fail in the non-iid case.

As an example - is it possible to construct $X_1, X_2, \ldots, X_n$ that are deliberately non-iid (e.g. with autocorrelation), and then demonstrate that in this example $\sqrt{n}(\bar{X} - \mu)/\sigma$ will NOT converge to a standard normal distribution?

Thanks!


There are 3 best solutions below


A trivial example with non-independent variables: let $X_1$ be Uniform on $[-1,1]$ and let all of $X_2, X_3, \ldots$ coincide with $X_1$ (i.e. $X_k = X_1$). Then $\bar{X} = X_1$ for every $n$, so $\sqrt{n}\,\bar{X}/\sigma$ has variance $n$ and cannot converge to a standard normal.
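A quick simulation of this construction (a sketch assuming NumPy; the parameter values are illustrative) makes the failure concrete: since every $X_k$ equals $X_1$, the sample mean is just $X_1$, and the CLT-scaled statistic has variance $n$ rather than $1$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 1000, 100_000

# X_1 ~ Uniform[-1, 1] and X_2 = X_3 = ... = X_1, so the sample mean is X_1.
x1 = rng.uniform(-1, 1, size=reps)
sample_means = x1  # (X_1 + ... + X_n)/n = X_1 for every n

# Uniform[-1, 1] has variance 1/3.  If the classic CLT applied,
# sqrt(n) * mean / sigma would be ~N(0, 1); here its variance is n instead.
scaled = np.sqrt(n) * sample_means / np.sqrt(1 / 3)
print(scaled.var())  # roughly n = 1000, nowhere near 1
```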

An example for independent but not identically distributed variables (from here). Let $Z_k$ be iid Uniform on $[-1,1]$ and let $X_k = a^k Z_k$ for some $0<a<1$. Then the $X_k$ are independent and uniform on $[-a^k, a^k]$, so $E[X_k]=0$ and $\sigma_k^2 = \frac13 a^{2k}$. Because $\sum_{k \geq 1} X_k$ is confined to the interval $(-a/(1-a),\, a/(1-a))$, the sum cannot converge to a Gaussian distribution, whose support is unbounded.
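This boundedness is easy to check numerically. The sketch below (assuming NumPy; $a = 1/2$ is an illustrative choice) draws many realizations of $\sum_{k=1}^{n} a^k Z_k$ and confirms they never leave a fixed bounded interval, so no rescaling of the sum can be approximately Gaussian.

```python
import numpy as np

rng = np.random.default_rng(1)
a, n, reps = 0.5, 50, 100_000

# Z_k iid Uniform[-1, 1]; X_k = a^k * Z_k for k = 1..n, as in the construction above.
k = np.arange(1, n + 1)
z = rng.uniform(-1, 1, size=(reps, n))
s = (a**k * z).sum(axis=1)

# The partial sums are confined to (-a/(1-a), a/(1-a)) = (-1, 1) for a = 1/2,
# while a Gaussian limit would require unbounded support.
print(bool(abs(s).max() < 1))  # True: every realization stays inside (-1, 1)
```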

Another example here


Great question. For an example with autocorrelation in which $\sqrt{n}(\bar{X}-\mu)/\sigma$ does not converge to a standard normal, consider the following MA(1) model and its autocovariances \begin{align} X_t&=\theta Z_{t-1}+Z_t, \quad Z_t \overset{iid}{\sim} N(0,\sigma_Z^2)\\ \sigma_X^2&=\gamma(0)=(\theta^2+1)\sigma_Z^2\\ \gamma(1)&=\theta\sigma_Z^2\\ \gamma(k)&=0 \; \forall \; k>1. \end{align}

The non-zero autocorrelation means that in the limit the variance of $\sqrt{n}(\bar{X}-\mu)$ is no longer $\sigma^2_X=(\theta^2+1)\sigma^2_Z$ but is instead $\gamma(0)+2\sum_{j=1}^\infty \gamma(j)= \sigma^2_Z(\theta+1)^2$. The scaled mean still converges to a normal distribution, but with this different variance, so dividing by $\sigma_X$ does not produce a standard normal.
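A short simulation (a sketch assuming NumPy; the parameter values are illustrative) shows the variance of $\sqrt{n}\,\bar{X}$ for this MA(1) settling near the long-run value $(\theta+1)^2\sigma_Z^2$ rather than the marginal variance $(\theta^2+1)\sigma_Z^2$:

```python
import numpy as np

rng = np.random.default_rng(2)
theta, sigma_z, n, reps = 0.8, 1.0, 1000, 5000

# Simulate reps independent MA(1) paths: X_t = theta * Z_{t-1} + Z_t.
z = rng.normal(0, sigma_z, size=(reps, n + 1))
x = theta * z[:, :-1] + z[:, 1:]

# Empirical variance of sqrt(n) * sample mean across the paths.
var_scaled_mean = (np.sqrt(n) * x.mean(axis=1)).var()

print(round(var_scaled_mean, 2))              # near (theta+1)^2 = 3.24 ...
print(round((theta**2 + 1) * sigma_z**2, 2))  # ... not the marginal variance 1.64
```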

This limiting quantity is referred to as the long-run variance, and it is part of what makes analyzing time series so difficult. I will also add that there have been extensions of the CLT which can handle certain forms of time dependence (see here for example).


Perhaps you can try a simulation-based approach?

  • Using Python, generate random sequences of data in which each new number is correlated with the previous ones (e.g. an autoregressive process).
  • Then, evaluate (and record) the sample mean of this random sequence.
  • Repeat the first two steps many times and plot a histogram of the resulting sample means.
  • By studying the EDF (Empirical Distribution Function) of the scaled sample mean (e.g. with a Kolmogorov-Smirnov test), compare it with the CDF of the theoretical standard normal distribution.
  • If the two distributions are statistically different from one another, you can conclude empirically that the conclusion of the Central Limit Theorem need not hold in the absence of the iid condition.
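The steps above can be sketched as follows (assuming NumPy and SciPy; the AR(1) process and parameter values are illustrative choices, not the only option):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
phi, n, reps = 0.7, 500, 5000

# Steps 1-3: generate AR(1) sequences X_t = phi * X_{t-1} + eps_t, record the means.
sigma_x = 1 / np.sqrt(1 - phi**2)  # marginal std of the stationary AR(1)
sample_means = np.empty(reps)
for r in range(reps):
    eps = rng.normal(0, 1, size=n)
    x = np.empty(n)
    x[0] = rng.normal(0, sigma_x)  # start in the stationary distribution
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    sample_means[r] = x.mean()

# Steps 4-5: compare sqrt(n) * mean / sigma_x against N(0, 1) with a KS test.
scaled = np.sqrt(n) * sample_means / sigma_x
ks = stats.kstest(scaled, "norm")
print(ks.pvalue < 0.01)  # True: the classic CLT scaling is rejected here
```

Note that for this AR(1) the scaled mean is still asymptotically normal, just with the long-run variance in place of the marginal $\sigma_X^2$; the KS test rejects the standard normal, which is exactly the failure mode the question asks about.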