I've been reviewing the Central Limit Theorem in statistics and I'm having trouble intuitively grasping how it can be practically used. The theorem tells us, subject to constraints, that the distribution of the sample mean will be approximately normal, even if the population distribution is not. From what I understand, this remarkable fact allows us to use inferential statistical methods that apply to the normal distribution to answer questions about the population distribution. But I struggle to grasp intuitively why this is possible. That is, if I want to answer some question about a random variable $X$ of unknown distribution, how does working with the sample mean of $X$ get me to the answer about $X$? Is there a step missing that allows me to infer properties of $X$ from properties of the sample mean of $X$?
Grasping the practical usage of the Central Limit Theorem
135 views · Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail). There are 4 solutions below.
You're right: there does seem to be a step missing. If $X_1,\dots,X_n$ are i.i.d. with mean $\mu$ and variance $\sigma^2$, the CLT says that for large $n$, $X_1+\cdots+X_n$ is approximately $N(n\mu, n\sigma^2)$. It works because of the unstated assumption that the sample is i.i.d. From the limit you can only recover the mean and the variance, not the type of distribution the sample came from. That's about all there is to the theorem.
On the other hand, you're asking how you could tell anything about the individual variables if all you have is that $X_1+\cdots+X_n$ is approximately normal with some mean and variance, where, say, $X_1$ is a number of births and $X_n$ is a number of cigarettes smoked in a day. You can't.
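A quick simulation makes the first point concrete: standardized sample means from a heavily skewed population still behave like a standard normal, even though everything about the population except $\mu$ and $\sigma$ is washed out. (The exponential population and the sample sizes here are arbitrary choices for illustration.)

```python
import random
import statistics

random.seed(0)

# Population: exponential with rate 1 (mean 1, variance 1) -- heavily skewed,
# nothing like a normal distribution.
mu, sigma = 1.0, 1.0
n = 200       # size of each sample
reps = 5000   # number of independent samples

# Standardize each sample mean: Z = (X̄ - mu) / (sigma / sqrt(n)).
zs = []
for _ in range(reps):
    xbar = statistics.fmean(random.expovariate(1.0) for _ in range(n))
    zs.append((xbar - mu) / (sigma / n ** 0.5))

# If the CLT holds, Z should be approximately standard normal:
print(round(statistics.fmean(zs), 2))   # close to 0
print(round(statistics.stdev(zs), 2))   # close to 1
```

Note that knowing the $Z$ values tells you nothing about the skewness of the exponential population: only $\mu$ and $\sigma$ survive the averaging.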
Here is an example which, if you work it through, will make you appreciate the CLT. Say you toss a coin 10,000 times and you get heads 5,600 times. The question: is this coin fair? Intuitively, the numbers of heads and tails for a fair coin should be about fifty-fifty. In this case we have 5,600 and 4,400, so one might think... yeah, there is nothing wrong with this coin. But take i.i.d. random variables that equal $1$ with probability $1/2$ and $-1$ with probability $1/2$, and estimate the probability that their sum is greater than or equal to $1{,}200$ (the observed heads minus tails) for large $N$ (here $N = 10{,}000$). This is where you would use the CLT, and you will find that the probability of such an event is vanishingly small, far below 1%. In words, it is highly unlikely that a fair coin would produce 5,600 heads in 10,000 trials. This is how one uses the CLT in general: you test a hypothesis by modeling a sequence of i.i.d. variables, apply the CLT, and then compare with the empirical data.
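The calculation above can be sketched in a few lines. Under the fair-coin hypothesis the sum $S$ of the $\pm 1$ variables has mean $0$ and variance $N$, so the CLT lets us treat $S$ as approximately $N(0, N)$ and read off a tail probability from the normal distribution (via the complementary error function).

```python
import math

N = 10_000       # tosses
heads = 5_600    # observed heads

# Model each toss as +1 (heads) or -1 (tails), each with probability 1/2.
# Under the fair-coin hypothesis, S = heads - tails has mean 0 and variance N.
s = heads - (N - heads)      # observed sum: 1,200
z = s / math.sqrt(N)         # standardized: 12 standard deviations out

# CLT: S is approximately N(0, N), so the one-sided tail probability is
# P(Z >= z) = erfc(z / sqrt(2)) / 2 for a standard normal Z.
p = 0.5 * math.erfc(z / math.sqrt(2))
print(z)   # 12.0
print(p)   # astronomically small -- effectively impossible for a fair coin
```

A result 12 standard deviations from the mean is overwhelming evidence against the fair-coin hypothesis.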
You are quite right. In a way, the CLT smooths out all characteristics of the distribution of $X$, as you always end up with a normal. All information is lost except $\mu$ and $\sigma$.
On the other hand, the CLT is very powerful for studying the properties of the mean of a sample quantitatively, even without knowledge of the distribution. Like Chebyshev's inequality, it is universal.
Many probability distributions can be entirely characterized by a few numbers. For example, a Gaussian distribution (i.e. a bell curve) is characterized by its mean and its variance. Most probability distributions have their mean as one of their parameters, and when explaining your results to someone it's often effective to state the expected value of a quantity and the confidence with which you can state it. So having the Central Limit Theorem show how your sample mean relates to the true mean is useful both for characterizing the probability distribution and for explaining to people how confident you are in your result.
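The "state a value with a confidence" part is exactly the standard CLT-based confidence interval: since $\bar X$ is approximately $N(\mu, s^2/n)$, an approximate 95% interval for the true mean is $\bar X \pm 1.96\, s/\sqrt{n}$. A minimal sketch (the uniform sample here is an arbitrary stand-in for data from an unknown distribution; its true mean is $5$):

```python
import math
import random
import statistics

random.seed(1)

# A sample from a distribution we pretend not to know (uniform on [0, 10]).
sample = [random.uniform(0, 10) for _ in range(400)]

n = len(sample)
xbar = statistics.fmean(sample)   # sample mean
s = statistics.stdev(sample)      # sample standard deviation

# CLT: X̄ is approximately N(mu, s^2 / n), so an approximate 95% confidence
# interval for the true mean uses the standard normal quantile 1.96.
half_width = 1.96 * s / math.sqrt(n)
lo, hi = xbar - half_width, xbar + half_width
print(f"mean ≈ {xbar:.2f}, 95% CI ≈ ({lo:.2f}, {hi:.2f})")
```

Nothing in the interval construction used the shape of the underlying distribution; that universality is precisely what the CLT buys you.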