Standard Deviation vs Population Standard Deviation

A GRE study guide distinguishes between the standard deviation and the sample (or population) standard deviation.

I understand the mechanics of this, i.e. for the standard deviation you divide the sum of the squared differences from the mean by $n$, but for the population standard deviation you divide by $n - 1$.

Honestly, why not just divide by $n$? I really don't understand the logic behind this.

When I search for an explanation, I only find the same mechanical description of the difference.

Kindly explain.

There are 2 best solutions below

Generally, "variance" refers to the variance of a known or theoretical distribution, usually denoted $\sigma^2$. For a sample, the distribution that the data follow is usually unknown or hard to access, so the variance has to be estimated; the estimate is denoted $s^2$. This $s^2$ divides by $n-1$ because that makes it an unbiased estimator of the true variance $\sigma^2$. The respective standard deviations are $\sigma$ and $s$. However, $s$ is no longer an unbiased estimator of $\sigma$: the square root is a concave function, so on average $s$ underestimates $\sigma$. Read more here.
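
As a rough check of the claims above, here is a minimal sketch that enumerates every equally likely sample of size 2 (drawn with replacement) from a small population and averages each estimator exactly. The population values, the sample size, and the helper name `sum_sq_dev` are assumptions chosen for illustration, not something taken from the study guide.

```python
from fractions import Fraction
from itertools import product
from math import sqrt


def sum_sq_dev(xs):
    """Sum of squared deviations of xs from their own mean (exact arithmetic)."""
    m = Fraction(sum(xs), len(xs))
    return sum((x - m) ** 2 for x in xs)


# Hypothetical small population, chosen so the numbers come out round.
population = [2, 4, 4, 4, 5, 5, 7, 9]
n = 2  # assumed sample size

# Population variance: divide by the population size.
sigma2 = sum_sq_dev(population) / len(population)

# Every sample of size n drawn with replacement; all are equally likely.
samples = list(product(population, repeat=n))

# Average of each estimator over all possible samples (its expected value).
mean_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)
mean_div_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)
mean_s = sum(sqrt(sum_sq_dev(s) / (n - 1)) for s in samples) / len(samples)

print("population variance:      ", sigma2)
print("average of divide-by-n:   ", mean_div_n)
print("average of divide-by-n-1: ", mean_div_n_minus_1)
print("average of s vs sigma:    ", mean_s, "vs", sqrt(sigma2))
```

In this example the divide-by-$(n-1)$ estimator averages exactly to $\sigma^2$, the divide-by-$n$ estimator averages to $\frac{n-1}{n}\sigma^2$, and the average of $s$ falls below $\sigma$.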

You have this backwards: It is in the sample variance that you divide by $n-1$. If $n$ is the size of the whole population then the population variance is the average of the squares of the deviations from the mean, i.e. the sum of those squares divided by $n$.

Division by $n-1$, if done at all, should be done ONLY when using the sample to ESTIMATE the variance of the whole population.

This section of a Wikipedia article explains why it is done.
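
For reference, the usual argument can be sketched in a few lines. It assumes $X_1,\ldots,X_n$ are independent with common mean $\mu$ and variance $\sigma^2$, and writes $\bar X$ for the sample mean; this notation is introduced here for illustration and is not quoted from the linked article. Expanding around $\mu$ gives
$$\sum_{i=1}^n (X_i-\bar X)^2 = \sum_{i=1}^n (X_i-\mu)^2 - n(\bar X-\mu)^2.$$
Taking expected values, and using $\operatorname{E}\left[(X_i-\mu)^2\right]=\sigma^2$ and $\operatorname{E}\left[(\bar X-\mu)^2\right]=\operatorname{var}(\bar X)=\sigma^2/n$,
$$\operatorname{E}\left[\sum_{i=1}^n (X_i-\bar X)^2\right] = n\sigma^2 - \sigma^2 = (n-1)\sigma^2.$$
So dividing the sum of squares by $n-1$ gives an estimate whose expected value is $\sigma^2$, while dividing by $n$ gives one whose expected value is $\frac{n-1}{n}\sigma^2$, which is too small on average.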

Whether it ought to be done, i.e. whether estimates ought to be unbiased, is debatable.

You may have read that if $X_1,\ldots,X_n$ are independent random variables then $$\operatorname{var}(X_1+\cdots+X_n) = \operatorname{var}(X_1) + \cdots + \operatorname{var}(X_n). \tag 1$$ That identity holds for the variance in which you divide by $n$, but not for the one in which you divide by $n-1$. The identity $(1)$ is the reason why the standard deviation, rather than the mean absolute deviation or some other measure of dispersion, is used.
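
One way to see this concretely is a small sketch with two independent fair dice. It treats the 6 faces of one die, and the 36 equally likely pairs of faces, as the whole population; that reading of the claim, and the helper `variance` with its `ddof` parameter, are assumptions made here for illustration.

```python
from fractions import Fraction
from itertools import product


def variance(values, ddof):
    """Sum of squared deviations divided by (len(values) - ddof).

    ddof=0 divides by n, ddof=1 divides by n - 1.
    """
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((v - mean) ** 2 for v in values) / (n - ddof)


die = [1, 2, 3, 4, 5, 6]
# All 36 equally likely outcomes of the sum of two independent dice.
sums = [x + y for x, y in product(die, die)]

# Dividing by n: the variance of the sum equals the sum of the variances.
pop_lhs = variance(sums, 0)
pop_rhs = variance(die, 0) + variance(die, 0)

# Dividing by n - 1: the two sides no longer agree.
bessel_lhs = variance(sums, 1)
bessel_rhs = variance(die, 1) + variance(die, 1)

print("divide by n:    ", pop_lhs, "vs", pop_rhs)
print("divide by n - 1:", bessel_lhs, "vs", bessel_rhs)
```

With the divisor $n$ both sides come out to $35/6$; with the divisor $n-1$ the left side is $6$ and the right side is $7$.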