Why do we calculate variance if standard deviation serves the ends well?


I don't understand why we even care about squared units. How does it make sense that taking the squared difference between each data point and the mean, summing, and dividing by $n-1$ gives us a measure of spread? Variance is not as intuitive to me as standard deviation, which makes complete sense. Can anyone help me understand the importance of variance and how its formula makes sense?

3 Answers


My opinion:

A naive approach to measuring the spread around the mean would be $\mathbb{E}[|X-\mathbb{E}[X]|]$, but the absolute value has poor analytic properties: in particular, it is not differentiable at zero, which makes it awkward for optimization. I suspect that is why we prefer the square, which is smooth and better behaved, leading to $\mathbb{E}[(X-\mathbb{E}[X])^2]$. However, squaring changes the units of the quantities involved, and to return to the original scale we take a square root, which gives the standard deviation.
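As a quick numerical sketch of the two measures of spread (my own illustration, using a small made-up data set, with the population-style divisor $n$ rather than $n-1$ so the numbers match $\mathbb{E}[\cdot]$ directly):

```r
# Compare the "naive" mean absolute deviation with the standard deviation
# on a small data set, using divisor n (population-style).
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
m <- mean(x)                   # sample mean, here 5
mad.mean <- mean(abs(x - m))   # E|X - E[X]| analogue:  1.5
v <- mean((x - m)^2)           # E[(X - E[X])^2] analogue:  4
s <- sqrt(v)                   # back to the original units:  2
```

Both 1.5 and 2 are in the same units as the data; the variance, 4, is in squared units, which is the point the question raises.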


Standard deviation is the square root of variance, so they're measuring the same thing. The standard deviation approximates a typical element's distance from the mean (as you pointed out), so it's useful for visually understanding a distribution, whereas the variance has more convenient algebraic properties and tends to show up directly in probability theorems more often. Luckily, it's easy to convert between the two.
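One of those convenient algebraic properties: for independent $X$ and $Y$, variances add, $\mathrm{Var}(X+Y)=\mathrm{Var}(X)+\mathrm{Var}(Y)$, while standard deviations do not. A simulation sketch in R (my own example, not from the answer above):

```r
# Variances of independent variables add; standard deviations do not.
set.seed(1)
x <- rnorm(10^6, mean = 0, sd = 3)   # Var(X) = 9
y <- rnorm(10^6, mean = 0, sd = 4)   # Var(Y) = 16
var(x + y)   # approximately 25 = 9 + 16
sd(x + y)    # approximately 5, not 3 + 4 = 7
```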


A couple of reasons involving statistical practice.

(1) For a random sample of size $n$ from a normal population where $\mu$ and $\sigma^2$ are both unknown, one has the relationship $$\frac{(n-1)S^2}{\sigma^2}\sim\mathsf{Chisq}(\nu= n-1),$$ which can be used to make a CI for $\sigma^2$ and to test hypotheses involving $\sigma^2.$ [To find a CI for $\sigma,$ take square roots of the endpoints of a CI for $\sigma^2$.]

Using R, we get the CI $(40.05,\, 70.12)$ for $\sigma^2$ from a sample of size $n = 100$ from a normal population known to have $\sigma^2 = 49.$

set.seed(131)
x = rnorm(100, 50, 7)   # norm samp size 100. var 49
CI = 99*var(x)/qchisq(c(.975,.025), 99)
CI
[1] 40.05443 70.11714   # 95% CI for pop variance
sqrt(CI)
[1] 6.328857 8.373598   # 95% CI for pop SD

(2) $E(S^2) = \sigma^2,$ but for normal data $E(S) < \sigma.$ The bias of $S$ as an estimate of $\sigma$ is especially noticeable for small $n.$

The following simulation in R, using a million samples of size $n=4,$ illustrates the bias of the sample standard deviation $S.$ (Any one normal sample of size 4 can have an unusually large or small standard deviation. However, by looking at a million samples, it becomes clear that the bias is toward values of $S$ that underestimate $\sigma$.)

 set.seed(2022)
 n = 4;  sg = 15
 s = replicate(10^6, sd(rnorm(n, 100, sg)))
 mean(s)
 [1] 13.81432  # noticeably smaller than 15
 mean(s^2)
 [1] 224.8197  # approx = pop variance 225