Why square the result of $x_i - \bar{x}$ in the standard deviation?
I don't understand why it is necessary to square the result of $x_i - \bar{x}$ in $$\sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N-1}}.$$ In fact, I don't even understand why $N - 1$ appears in the denominator instead of just $N$. Could someone explain this or recommend a good text on the subject? Every book on error theory or statistics that I have found is either too abstract or too simplistic. Thanks in advance.
Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail). There are 5 answers below.
It is necessary to square the deviations from the mean because you want both positive and negative deviations to contribute to the spread (note that $\sum (x_i - \bar{x})$ is always zero). Another possibility is to take absolute values, but the formula above turns out to have nicer properties (such as additivity of variances, as pointed out by Arkamis).
Regarding the $N-1$ in the denominator: you would underestimate the standard deviation when dividing by $N$, since the true mean is not as close to $x_1, \ldots, x_N$ as the sample mean $\bar{x}$ is (in fact, $\bar{x}$ is calculated to be ''as close'' to the data points as possible). That $N$ has to be replaced by $N-1$ can be derived by working out the expected value of your formula for the variance (the square of the SD). It turns out that with $N-1$ the expected value equals the population variance, i.e. the sample variance is an unbiased estimator of the true variance.
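This underestimation is easy to check numerically. The following sketch (in Python, an illustrative choice not taken from the answer itself) draws many small samples from a population with known variance $1$ and averages the two competing estimators:

```python
import random

random.seed(0)

# Population: standard normal, so the true variance is exactly 1.0.
N = 5             # a small sample size makes the bias clearly visible
trials = 100_000

biased_total = 0.0
unbiased_total = 0.0
for _ in range(trials):
    sample = [random.gauss(0.0, 1.0) for _ in range(N)]
    xbar = sum(sample) / N
    ss = sum((x - xbar) ** 2 for x in sample)
    biased_total += ss / N          # divide by N
    unbiased_total += ss / (N - 1)  # divide by N - 1 (Bessel's correction)

print(biased_total / trials)    # ≈ (N-1)/N * 1.0 = 0.8, i.e. biased low
print(unbiased_total / trials)  # ≈ 1.0, i.e. unbiased
```

The divide-by-$N$ average settles near $0.8 = (N-1)/N$ of the true variance, exactly the shortfall the answer describes.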
The square is used to remove the effect of the sign of $x_i - \overline{x}$. Suppose your mean was 0, and you had measurements at -2 and +2. These would cancel, but squaring gets rid of that issue.
Now, you might ask, "why not use absolute value?" Great question! The reason is that with the squared definition, variances of independent variables are additive: $\textrm{Var}(x_1 + x_2 + \cdots + x_m) = \textrm{Var}(x_1) + \textrm{Var}(x_2) + \cdots + \textrm{Var}(x_m)$. If we used absolute values instead, this property would be lost.
As far as the $n-1$ term goes, it has to do with the fact that with $n$ data points, only $n-1$ degrees of freedom remain once the sample mean has been computed. Dividing by $n-1$ rather than $n$ removes the resulting bias in the variance estimate.
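Both halves of this answer can be checked with a short simulation. This sketch (Python; the uniform distribution is an arbitrary choice for illustration) shows that variance is additive for independent variables while the mean absolute deviation is not:

```python
import random
import statistics

random.seed(1)
n = 100_000

# Two large independent samples from Uniform(-1, 1).
x = [random.uniform(-1, 1) for _ in range(n)]
y = [random.uniform(-1, 1) for _ in range(n)]
s = [a + b for a, b in zip(x, y)]   # their sum, element-wise

def mad(data):
    """Mean absolute deviation from the mean."""
    m = sum(data) / len(data)
    return sum(abs(d - m) for d in data) / len(data)

var_sum = statistics.pvariance(s)
var_parts = statistics.pvariance(x) + statistics.pvariance(y)
print(var_sum, var_parts)        # nearly equal: variance is additive

print(mad(s), mad(x) + mad(y))   # clearly different: MAD is not additive
```

Here $\mathrm{Var}(X+Y) \approx \mathrm{Var}(X) + \mathrm{Var}(Y) \approx 2/3$, while the mean absolute deviation of the sum ($\approx 0.67$) falls well short of the sum of the individual deviations ($\approx 1.0$).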
Hint: You can measure the spread by the mean absolute deviation $|x - \bar{x}|$ or by the squared differences. Take the sequence $-3, 0, 3$; its mean $\bar{x}$ is $0$. If you just summed $x - \bar{x}$ without taking absolute values, the spread would come out as $0$, because the deviations cancel. The mean square avoids this situation and gives you an objective measure of spread (to bring the unit of spread back to that of the original measurements, you take the square root of it). As for dividing by $N-1$: it is the number of observations minus the number of parameters estimated from the data (the degrees of freedom used up). Here you calculate $\bar{x}$, which is one estimator, and hence subtract one.
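The $-3, 0, 3$ example from this hint, worked out in a few lines of Python (an illustrative sketch, not part of the original answer):

```python
data = [-3, 0, 3]
xbar = sum(data) / len(data)            # mean is 0.0

raw = sum(x - xbar for x in data)       # signed deviations cancel -> 0.0
abs_dev = sum(abs(x - xbar) for x in data) / len(data)   # 2.0

sq = sum((x - xbar) ** 2 for x in data)  # 9 + 0 + 9 = 18
sample_var = sq / (len(data) - 1)        # divide by N-1 = 2 -> 9.0
sd = sample_var ** 0.5                   # square root restores units -> 3.0
print(raw, abs_dev, sample_var, sd)
```

The raw sum of deviations is useless ($0$ for any data set), while both the absolute and squared versions report a genuine spread.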
Squaring the Deviations
The variance of a sample measures the spread of the values in a sample or distribution. We could do this with any function of $|x_k-\bar{x}|$. The reason that we use $(x_k-\bar{x})^2$ is because the variance computed this way has very nice properties. Here are a couple:
$1$. The variance of the sum of independent variables is the sum of their variances.
Since $X$ and $Y$ are independent, their joint probabilities multiply. Therefore, $$ \begin{align} \hspace{-1cm}\mathrm{Var}(X+Y) &=\sum_{i=1}^n\sum_{j=1}^m\Big[(x_i+y_j)-(\bar{x}+\bar{y})\Big]^2p_iq_j\\ &=\sum_{i=1}^n(x_i-\bar{x})^2p_i+\sum_{j=1}^m(y_j-\bar{y})^2q_j+2\sum_{i=1}^n(x_i-\bar{x})p_i\sum_{j=1}^m(y_j-\bar{y})q_j\\ &=\sum_{i=1}^n(x_i-\bar{x})^2p_i+\sum_{j=1}^m(y_j-\bar{y})^2q_j\\ &=\mathrm{Var}(X)+\mathrm{Var}(Y)\tag{1} \end{align} $$ The cross term vanishes because $\sum_{i=1}^n(x_i-\bar{x})p_i=0$.
$2$. The mean is the point from which the mean square deviation is minimized: $$ \begin{align} \sum_{i=1}^n(x_i-a)^2p_i &=\sum_{i=1}^n(x_i^2-2ax_i+a^2)p_i\\ &=\sum_{i=1}^n\left(x_i^2-2\bar{x}x_i+\bar{x}^2+(\bar{x}-a)(2x_i-\bar{x}-a)\right)p_i\\ &=\left(\sum_{i=1}^n(x_i-\bar{x})^2p_i\right)+(\bar{x}-a)^2\tag{2} \end{align} $$
Dividing by $\mathbf{n-1}$
Considering $(2)$, it can be seen that the mean square of a sample measured from the mean of the sample will be smaller than the mean square of the sample measured from the mean of the distribution. Quantifying this idea by computing the expected value shows that $$ \mathrm{E}[v_s]=\frac{n{-}1}{n}v_d\tag{3} $$ where $\mathrm{E}[v_s]$ is the expected value of the sample variance (computed with $n$ in the denominator) and $v_d$ is the distribution variance. $(3)$ explains why we estimate the distribution variance as $$ v_d=\frac1{n-1}\sum_{i=1}^n(x_i-\bar{x})^2\tag{4} $$ where $\bar{x}$ is the sample mean.
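Property $(2)$ above can be verified numerically. A minimal sketch in Python, with values and probabilities made up purely for illustration: the mean square deviation from any point $a$ exceeds the one from $\bar{x}$ by exactly $(\bar{x}-a)^2$.

```python
# A made-up discrete distribution: values with probabilities summing to 1.
xs = [1.0, 2.0, 4.0, 7.0]
ps = [0.1, 0.4, 0.3, 0.2]

xbar = sum(x * p for x, p in zip(xs, ps))   # distribution mean -> 3.5

def msq(a):
    """Mean square deviation measured from the point a."""
    return sum((x - a) ** 2 * p for x, p in zip(xs, ps))

# Identity (2): msq(a) = msq(xbar) + (xbar - a)^2 for every a,
# so a = xbar is where the mean square deviation is smallest.
for a in [0.0, 1.5, xbar, 5.0]:
    assert abs(msq(a) - (msq(xbar) + (xbar - a) ** 2)) < 1e-12
print(xbar, msq(xbar))
```

Measuring from any point other than the mean always inflates the mean square, which is precisely why the sample mean (the closest point to its own sample) understates the spread, as $(3)$ quantifies.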
Squaring $x_i-\bar x$:
If we didn't square it, we would just be adding up $x_i - \bar x$, and that will always give us zero. What we want instead is to total "how far" each $x_i$ is from $\bar x$.
So, we need to make sure we're averaging some positive quantity representing how far $x_i$ is from $\bar x$; one good choice is $(x_i - \bar x)^2$. Another is $|x_i - \bar x|$, which leads to the average absolute deviation. It turns out that standard deviation tends to be "nicer" for most uses, though both are measurements of how "spread out" your data is.
Using $N-1$:
Using $N-1$ instead of $N$ is called Bessel's correction; the Wikipedia page on Bessel's correction gives several proofs of why dividing by $N-1$ yields a better estimate of the population standard deviation.
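In practice, statistics libraries expose both conventions side by side. For example, Python's standard-library `statistics` module (the data set here is made up for illustration) has a population version dividing by $N$ and a sample version applying Bessel's correction:

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Population standard deviation: divides by N.
print(statistics.pstdev(data))   # -> 2.0

# Sample standard deviation: divides by N - 1 (Bessel's correction).
print(statistics.stdev(data))    # -> sqrt(32/7) ≈ 2.138
```

The sample version is always the larger of the two, compensating for the fact that deviations are measured from the sample mean rather than the unknown population mean.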