Could someone please explain how to apply the CLT? I understand the theory behind it, I'm just not sure how to apply it to my data. I have a large, non-normal data set of 2000 observations and I've taken a sample of around 50. Do I run my tests on that 50? Or do I keep taking samples and run the tests on the means of those samples? It's only for basic mean and variance comparison tests.
Thank you for your help.
The CLT states that the arithmetic mean (average) of a large number of i.i.d. (independent and identically distributed) random variables with finite variance is approximately normally distributed, regardless of the distribution the variables themselves come from.
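To see what that statement means in practice, here is a minimal simulation sketch. It assumes a made-up, skewed population (exponential draws standing in for your 2000 values, which I don't have) and draws many samples of 50 purely for illustration. The names `population`, `sample_means`, and the choice of 5000 repeats are my own assumptions.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical stand-in for a non-normal data set of 2000 values.
population = [random.expovariate(1.0) for _ in range(2000)]

n = 50          # size of each sample
repeats = 5000  # number of repeated samples, for illustration only

# Draw many samples of size n (with replacement) and record each sample mean.
sample_means = [
    statistics.fmean(random.choices(population, k=n)) for _ in range(repeats)
]

pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

# CLT prediction: the means centre on pop_mean with spread pop_sd / sqrt(n).
predicted_se = pop_sd / n ** 0.5
observed_se = statistics.stdev(sample_means)

print(f"population mean {pop_mean:.3f}, "
      f"mean of sample means {statistics.fmean(sample_means):.3f}")
print(f"predicted SE {predicted_se:.3f}, observed SE {observed_se:.3f}")
```

The repeated sampling here is only a way to visualise the theorem. A histogram of `sample_means` should look roughly bell-shaped even though the population is strongly skewed.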
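So you do not need to keep drawing samples: the CLT is what justifies treating the mean of your single sample of 50 as approximately normal. Take the arithmetic mean of that sample and conduct a z-test on it (a sample size above 30 is the usual rule of thumb). Since the population standard deviation $\sigma$ is rarely known, the sample standard deviation is normally used in its place.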
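You could also construct a confidence interval for the mean, starting from $$P[-z_{\alpha/2}\leq Z=\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}}\leq z_{\alpha/2}]=1-\alpha$$ and solving for $\mu$, which gives the interval $\bar{X}\pm z_{\alpha/2}\,\sigma/\sqrt{n}$.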
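As a rough sketch of how that looks on a single sample of 50, the snippet below computes the z-interval and a two-sided z-test using only the Python standard library. The data are simulated (I don't have yours), the sample standard deviation is plugged in for $\sigma$, and the function names `z_interval` and `z_test` are just ones I made up for illustration.

```python
import math
import random
import statistics
from statistics import NormalDist

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical data: a skewed population of 2000 and one sample of 50 from it.
population = [random.expovariate(1.0) for _ in range(2000)]
sample = random.sample(population, 50)


def z_interval(data, alpha=0.05):
    """Approximate (1 - alpha) confidence interval for the mean."""
    n = len(data)
    xbar = statistics.fmean(data)
    s = statistics.stdev(data)               # assumption: s stands in for sigma
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, about 1.96 for 95%
    half_width = z * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


def z_test(data, mu0):
    """Two-sided z-test of H0: mean == mu0. Returns (z statistic, p-value)."""
    n = len(data)
    s = statistics.stdev(data)
    z = (statistics.fmean(data) - mu0) / (s / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


low, high = z_interval(sample)
z_stat, p_value = z_test(sample, mu0=1.0)  # mu0 = 1.0 is an arbitrary example
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
print(f"z = {z_stat:.3f}, p = {p_value:.3f}")
```

To use it on real data, replace `sample` with your 50 observations and `mu0` with whatever value you are testing against.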