Standard deviation of the mean of sample data


I can't quite understand what this formula means:

$$\sigma_{\overline{x}}=\frac{\sigma}{\sqrt n}$$

I know what standard deviation $\sigma$ is - it's the average distance of my data points (samples) from the mean. But this part is confusing:

For example, suppose the random variable $X$ records a randomly selected student's score on a national test, where the population distribution for the score is normal with mean $70$ and standard deviation $5$ ($N(70,5)$). Given a simple random sample (SRS) of $200$ students, the distribution of the sample mean score has mean $70$ and standard deviation $$\frac{5}{\sqrt{200}} \approx \frac{5}{14.14} \approx 0.35$$

Source

I thought the standard deviation $\sigma = 5$ means that if I take the scores of all students and calculate the mean, then the average distance of a score from that mean will be equal to $5$. The set of all scores is called the 'population', right? But here it says that the more students' scores I take, the lower the standard deviation. So the closer the sample size gets to the size of the population, the lower the standard deviation becomes (and the further it gets from $5$).


There are 3 best solutions below

2
On BEST ANSWER

First, the standard deviation is not the average distance to the mean. If the distances are counted with their sign (positive above the mean, negative below), their average is always zero. The standard deviation is, however, a measure of how far the points typically lie from the mean. Assuming the values are normally distributed, we know, for example, that about 68% of the values lie between $\mu-\sigma$ and $\mu+\sigma$.

Suppose we weigh potatoes with mean weight 100 g and standard deviation 5 g. What can we say about the average weight of a randomly chosen group of 4 potatoes? I hope you see that the expected value of this group average is still 100 g. But what is the standard deviation of this group average? That is where you use the formula

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{4}} =2.5$$
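If it helps, the potato example can be checked with a small simulation. This is only a sketch: it assumes the weights are independent and normally distributed (the formula itself does not need normality), and the function name, trial count, and seed are my own choices.

```python
import random
import statistics

def simulate_group_means(mu=100.0, sigma=5.0, n=4, trials=20_000, seed=0):
    """Draw `trials` groups of `n` weights and return the average of each group.

    Assumption: weights are independent draws from Normal(mu, sigma).
    """
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(trials)
    ]

group_means = simulate_group_means()
mean_of_means = statistics.fmean(group_means)  # expected to be close to 100
sd_of_means = statistics.pstdev(group_means)   # expected to be close to 5 / sqrt(4) = 2.5

print(f"mean of the group averages: {mean_of_means:.2f}")
print(f"sd of the group averages:   {sd_of_means:.2f}")
```

The individual potatoes still scatter with a standard deviation of about 5 g; it is only the averages of groups of 4 that scatter with about 2.5 g.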

Feel free to ask if you still don't understand.


Proof that the average signed distance between the data and their mean $\mu$ is $0$: $$\frac{\sum_{i=1}^n (x_i-\mu)}{n} = \frac{\left(\sum_{i=1}^n x_i\right)-\mu n}{n} = \frac{\sum_{i=1}^n x_i}{n}-\mu = \mu - \mu = 0$$
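A quick numerical check of this, using five made-up weights (the values are purely illustrative): the signed deviations cancel out, while the average absolute deviation and the standard deviation do not.

```python
import math
import statistics

# Hypothetical data, chosen only so that the mean is a round number.
weights = [92.0, 97.0, 100.0, 103.0, 108.0]

mu = statistics.fmean(weights)            # 100.0
deviations = [x - mu for x in weights]    # -8, -3, 0, 3, 8

avg_signed = sum(deviations) / len(weights)               # positives and negatives cancel
avg_abs = sum(abs(d) for d in deviations) / len(weights)  # mean absolute deviation: 4.4
sd = math.sqrt(sum(d * d for d in deviations) / len(weights))  # sqrt(29.2), about 5.40

print(avg_signed, avg_abs, round(sd, 2))
```

Note that the mean absolute deviation and the standard deviation are both measures of spread, but they are different numbers in general.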
2
On

Look carefully at the last sentence in the quote: in it 'standard deviation' refers to that of the sample mean. Thus one is essentially looking at all possible samples of 200 students, given that the population's standard deviation is 5. wythagoras' answer provides the formula for the sample mean's standard deviation.
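One cannot list all possible samples, but drawing many of them gives the idea. The sketch below uses the numbers from the quote (scores from a normal population with mean 70 and standard deviation 5) and compares the spread of the sample means for a few sample sizes. The function name, trial count, and seed are assumptions made for illustration.

```python
import math
import random
import statistics

def sd_of_sample_means(n, mu=70.0, sigma=5.0, trials=2_000, seed=1):
    """Simulate `trials` samples of size `n` and return the sd of their means."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(trials)
    ]
    return statistics.pstdev(means)

results = {n: sd_of_sample_means(n) for n in (1, 25, 200)}
for n, observed in results.items():
    predicted = 5.0 / math.sqrt(n)
    print(f"n={n:>3}: simulated {observed:.3f}, sigma/sqrt(n) = {predicted:.3f}")
```

With $n=1$ the "sample mean" is just a single score, so its spread is the population's $5$; with $n=200$ the sample means cluster much more tightly around $70$.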

2
On

You are computing the standard deviation of the mean, $\sigma_{\bar X}$, not that of the individual samples, $\sigma_X$.

When the variables are independent, their variances add up. So $$\text{var}_{\sum_i X_i}=n\,\text{var}_X,$$ and dividing the sum by $n$ divides its variance by $n^2$ (the variance is quadratic in the scale), $$\text{var}_{\bar X}=\frac1n\text{var}_X.$$

Hence, taking the square root,

$$\sigma_{\bar X}=\frac1{\sqrt n}\sigma_X.$$
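Both steps can be verified exactly on a small case, without any simulation. The sketch below takes a fair six-sided die as a stand-in for $X$ (my choice, not part of the question), enumerates all $6^3$ equally likely outcomes of $n=3$ independent rolls, and computes the variances with exact fractions.

```python
from fractions import Fraction
from itertools import product

def variance(values):
    """Exact population variance of a list of equally likely values."""
    values = [Fraction(v) for v in values]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

faces = range(1, 7)  # one fair die, used here as an illustrative X
n = 3                # number of independent rolls

var_x = variance(faces)                             # variance of a single roll
sums = [sum(rolls) for rolls in product(faces, repeat=n)]
var_sum = variance(sums)                            # variance of X_1 + ... + X_n
var_mean = variance([Fraction(s, n) for s in sums]) # variance of the average

print(var_x, var_sum, var_mean)
print(var_sum == n * var_x, var_mean == var_x / n)
```

The first comparison is the additivity of variances for independent variables, and the second is the division by $n^2$; the square root of the second gives $\sigma_{\bar X}=\sigma_X/\sqrt n$.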