Why don't we just take the root of the numerator instead of taking the root of the whole thing in the expression for standard deviation?


The formula for the standard deviation of a dataset is $\sqrt{\frac{\sum (x_i - \bar{x})^2}{N}}$. Why is that the case, and why can't we use $\frac{\sqrt{\sum (x_i - \bar{x})^2}}{N}$ instead? The units line up, and we don't have to worry about negatives in this case either. Thanks a million in advance!
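As a quick numerical check (with a small made-up dataset), the two expressions are genuinely different quantities, not two ways of writing the same thing:

```python
import math

# Hypothetical dataset, just to compare the two expressions.
data = [2.0, 4.0, 6.0, 8.0]
n = len(data)
mean = sum(data) / n
ss = sum((x - mean) ** 2 for x in data)  # sum of squared deviations

root_of_ratio = math.sqrt(ss / n)  # sqrt(sum / N): the standard deviation
ratio_of_root = math.sqrt(ss) / n  # sqrt(sum) / N: the proposed alternative

print(root_of_ratio, ratio_of_root)
```

Here `ss = 20`, so the first expression gives $\sqrt{20/4} = \sqrt{5} \approx 2.24$ while the second gives $\sqrt{20}/4 \approx 1.12$: dividing by $N$ outside the root scales by $1/N$, whereas dividing inside scales by $1/\sqrt{N}$.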


There are 2 answers below.


The formula for the (sample) standard deviation that you give is derived from the formula for the variance. Let's first look at the definition of the variance. The variance of a random variable $X$ is defined as the expected squared deviation from the mean: $$Var(X) = \mathbb{E}\left[(X-\mathbb{E}[X])^2\right].$$ Since this is an expectation (that is, a mean), it makes sense to estimate this by its sample version: $$\hat{v} = \frac{1}{N}\sum_{i=1}^N (x_i -\bar{x})^2,$$ that is, the sample average of the squared deviations from the sample average.

Now take a look at the definition of the standard deviation. The standard deviation is defined as $$std(X) = \sqrt{Var(X)}.$$ Hence, it makes sense to estimate it by the square root of our estimator for the variance: $$s = \sqrt{\hat{v}}.$$

That is, $$s = \sqrt{\frac{1}{N}\sum_{i=1}^N (x_i -\bar{x})^2}.$$
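As a sanity check, this $1/N$ estimator is exactly what Python's standard library computes as the population standard deviation (`statistics.pstdev`), using a hypothetical dataset:

```python
import math
import statistics

data = [2.0, 4.0, 6.0, 8.0]
n = len(data)
xbar = sum(data) / n

# Sample version of the variance: average squared deviation from the sample mean.
v_hat = sum((x - xbar) ** 2 for x in data) / n
s = math.sqrt(v_hat)

# statistics.pstdev implements the same sqrt(sum/N) formula.
assert math.isclose(s, statistics.pstdev(data))
```

(For the $1/(N-1)$ variant often used for samples, the library offers `statistics.stdev` instead.)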


I suspect you feel the $1/N$ should be pulled out because then the expression would "look like an average" (i.e. we divide by $N$). But note that we already took an average when we computed the variance: the variance is just the average of the squared deviations from the mean. You take each value's deviation from the mean, square that deviation, then average all of these squared deviations. That gives you the variance. But when we squared each deviation, we also "squared the units". To get back to the original units, we take the square root, and that gives you the standard deviation.
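The recipe above can be traced step by step, with a comment tracking the units at each stage (heights in metres are a made-up example):

```python
import math

heights = [1.60, 1.70, 1.80]                 # metres
mean = sum(heights) / len(heights)           # metres
deviations = [h - mean for h in heights]     # metres
squared = [d ** 2 for d in deviations]       # square metres!
variance = sum(squared) / len(squared)       # square metres (the average IS taken here)
std = math.sqrt(variance)                    # square root restores metres
```

The averaging happens at the `variance` step, before the square root, which is why the $1/N$ sits inside the radical.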