I came across this quote about the Standard Error in the OpenIntro Statistics book:
We required a large sample in Chapter 4 for two reasons:
1. The sampling distribution of $\bar{x}$ tends to be more normal when the sample is large.
2. The calculated standard error is typically very accurate when using a large sample.
So what should we do when the sample size is small? As we’ll discuss in Section 5.1.1, if the population data are nearly normal, then $\bar{x}$ will also follow a normal distribution, which addresses the first problem. The accuracy of the standard error is trickier, and for this challenge we’ll introduce a new distribution called the t-distribution.
I thought the standard error, defined as $SE = \frac{s}{\sqrt{n}}$ when we don't know $\sigma$ (the population standard deviation), just gets smaller as $n$ gets bigger. What does it mean that it gets more accurate when using a large sample?
I mean, I read about the CLT and this:
Conditions for $\bar{x}$ being nearly normal and SE being accurate
Important conditions to help ensure the sampling distribution of $\bar{x}$ is nearly normal and the estimate of SE sufficiently accurate:
• The sample observations are independent.
• The sample size is large: n > 30 is a good rule of thumb.
• The population distribution is not strongly skewed. This condition can be difficult to evaluate, so just use your best judgement.
It seems like the reason to use a large sample is that the standard error, which is a measure of the uncertainty associated with the point estimate, depends on $s$. And I guess $s$ becomes a more accurate estimate of the population standard deviation as $n$ increases? Is that right?
I guess this quote is enlightening:
There is one subtle issue in the equation for SE: the population standard deviation is typically unknown. You might have already guessed how to resolve this problem: we can use the point estimate of the standard deviation from the sample. This estimate tends to be sufficiently good when the sample size is at least 30 and the population distribution is not strongly skewed.
So that's it, right? As $n$, the sample size, gets bigger, we can be more sure that the sample's standard deviation accurately reflects the population standard deviation. So $s$ becomes a more reliable estimate as $n$ gets bigger... and so SE, the estimated standard deviation of the sampling distribution of the mean, also becomes more accurate (which means... closer to the real standard error, $\sigma/\sqrt{n}$). Is what I wrote correct?
Yes, you are correct, and that's the intuition.
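If it helps to see this numerically, here is a minimal simulation sketch. It assumes a normal population with a known $\sigma$ (chosen arbitrarily as 2.0), so that the estimated standard error $s/\sqrt{n}$ can be compared with the true one, $\sigma/\sqrt{n}$. The function name `se_estimate_spread` and its parameters are made up for this illustration and are not from the book.

```python
import math
import random
import statistics


def se_estimate_spread(n, sigma=2.0, trials=2000, seed=0):
    """Simulate the ratio (estimated SE) / (true SE) for samples of size n.

    Assumes a normal population with mean 0 and standard deviation sigma.
    Returns the mean and the standard deviation of the ratio across trials.
    """
    rng = random.Random(seed)
    true_se = sigma / math.sqrt(n)
    ratios = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
        estimated_se = s / math.sqrt(n)
        ratios.append(estimated_se / true_se)
    return statistics.mean(ratios), statistics.stdev(ratios)


if __name__ == "__main__":
    for n in (5, 30, 200):
        mean_ratio, spread = se_estimate_spread(n)
        print(f"n={n:4d}  mean ratio={mean_ratio:.3f}  spread={spread:.3f}")
```

A ratio of 1 means the estimated SE equals the true SE. The spread of the ratio should shrink as $n$ grows, which is the sense in which the calculated standard error is "more accurate" for large samples, separately from the fact that the SE itself gets smaller.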
All of the discussion in the question is fundamentally backed by the following three theorems: