On the relevance of CLT to the distributions of sums of i.i.d. random variables

The central limit theorem states (subtle convergence issues aside) that for i.i.d. random variables $X_i$ with mean $\mu$ and finite variance $\sigma^2 > 0$, $$\frac{\sum_{i=1}^n X_i - n\mu}{\sigma \sqrt{n}} \to \mathcal N(0,1)$$ in distribution as $n \to \infty$. Now, for large $n$, can we say that, approximately, $$\frac{\sum_{i=1}^n X_i - n\mu}{\sigma \sqrt{n}} \sim \mathcal N(0,1),$$ $$\sum_{i=1}^n X_i - n\mu \sim \mathcal N(0,\sigma^2n),$$ $$\sum_{i=1}^n X_i\sim \mathcal N(n\mu,\sigma^2n)?$$ This seems like a very simple closed-form approximation to the distribution of a sum of i.i.d. random variables.

There is 1 best solution below

Intuitively, you are right. Formally, of course, the only rigorous way to state and prove your claim is to rewrite it in terms of what you started with: the CLT.
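As a sketch of what that rewriting might look like: write $S_n = \sum_{i=1}^n X_i$ and let $\Phi$ be the standard normal CDF (both symbols are introduced here for convenience and are not in the question). The CLT says that for every fixed $x$, $$P\left(\frac{S_n - n\mu}{\sigma\sqrt{n}} \le x\right) \to \Phi(x) \quad \text{as } n \to \infty.$$ Because $\Phi$ is continuous, this convergence is in fact uniform in $x$, so substituting $x = (s - n\mu)/(\sigma\sqrt{n})$ gives $$\sup_{s \in \mathbb R} \left| P(S_n \le s) - \Phi\left(\frac{s - n\mu}{\sigma\sqrt{n}}\right) \right| \to 0.$$ The second term is exactly the CDF of $\mathcal N(n\mu, \sigma^2 n)$, which is the sense in which $S_n$ is approximately $\mathcal N(n\mu, \sigma^2 n)$ for large $n$. Note that this is a statement about CDFs only, and by itself it says nothing about how large $n$ has to be for a given accuracy.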

Some intuition for why your claim is correct: assume that the mean $\mu$ is zero and that $X$ has a density. The density then has some mass to the left of zero and some to the right. Since the density of a sum of independent random variables is the convolution of their densities (ask if you don't have an intuition for why), the density of the sum of two of your random variables is again centered at zero, but is somewhat flattened and smoothed out. For example, if the original variable is uniform, the density of the sum of two is a hat (triangular) function. It is the same with dice: each face of a single die is equally likely, but when you throw two dice, 7 is the most likely total and 12 comes up with probability only $1/36$.

When you add more and more variables, the distribution gets more and more flattened and smoothed. You might guess at this point that the limit of this process, if it exists, must be something whose shape doesn't change when it is convolved with itself. And it turns out that the normal distribution has exactly this property.

(CAVEAT: the normal distribution keeps its general shape under convolution, but its variance changes, because the variances add. This is why the division by $\sqrt{n}$ is needed. In other words, the limit of convolving a density with itself over and over again does not exist as a probability distribution, since the mass spreads out indefinitely, but a limit does exist once we rescale the width of the resulting densities.)
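The dice picture can be checked numerically. The following is only an illustrative sketch, not part of the original argument: it convolves the PMF of a fair die with itself $n$ times and compares the result with the density of $\mathcal N(n\mu, \sigma^2 n)$ evaluated at the integer totals. The helper names (`convolve`, `pmf_of_sum`, `normal_pdf`, `max_gap`) are made up for this example, and comparing a PMF on the integers directly with a density is an assumption that is reasonable here only because the possible totals are spaced exactly 1 apart.

```python
import math

def convolve(p, q):
    """Discrete convolution of two PMFs given as lists of probabilities."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def pmf_of_sum(pmf, n):
    """PMF of the sum of n i.i.d. copies, obtained by repeated convolution."""
    result = [1.0]  # PMF of the empty sum: all mass on a single point
    for _ in range(n):
        result = convolve(result, pmf)
    return result

def normal_pdf(x, mean, var):
    """Density of the normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

# Fair die: index k of the list corresponds to face k + 1.
die = [1.0 / 6.0] * 6
mu = sum((k + 1) * p for k, p in enumerate(die))                 # 3.5
sigma2 = sum((k + 1 - mu) ** 2 * p for k, p in enumerate(die))   # 35/12

def max_gap(n):
    """Largest absolute gap between the exact PMF of the total of n dice
    and the N(n*mu, n*sigma2) density at the same integer totals."""
    pmf = pmf_of_sum(die, n)
    # Index k of pmf corresponds to the total n + k (the smallest total is n).
    return max(abs(p - normal_pdf(n + k, n * mu, n * sigma2))
               for k, p in enumerate(pmf))

for n in (1, 2, 5, 30):
    print(n, max_gap(n))
```

Running this prints the largest gap for each $n$. The gap shrinks as $n$ grows, which is the flattening and smoothing described above, with the normal curve emerging as the limiting shape.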