Why is variance defined as $\sum\limits_{i=1}^{n} |\mu -x_i|^2$ and not $\sum\limits_{i=1}^{n} |\mu -x_i|$?


If we wanted to measure how much the values $x_1, \ldots ,x_n$ of a sample differ from the mean $\mu$, it seems more intuitive to me to use the formula $$\frac{\sum\limits_{i=1}^{n} |\mu -x_i|}{n}$$ instead of the formula for variance. I've read about some geometric interpretations of variance as well as of standard deviation, yet these just seem to push the question further back: what reason do we have to care more about the distance between the vectors $(x_1,\ldots ,x_n)$ and $(\mu ,\ldots ,\mu)$ than about the average distance between a possible value $x_0$ and $\mu$?

Some explanations of the variance formula point to the fact that variance pays more attention to values further apart from the mean, but two immediate questions come to mind: Why should we give more importance to values farther apart from the mean? And why should we do so by squaring the respective distances instead of, say, cubing them?
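To make the effect of squaring concrete, here is a small sketch (the sample values are invented for illustration): two samples with the same mean and the same average absolute deviation, whose variances nonetheless differ, because squaring weights the points farther from the mean more heavily.

```python
def mad(xs):
    """Mean absolute deviation: average of |mu - x_i|."""
    mu = sum(xs) / len(xs)
    return sum(abs(mu - x) for x in xs) / len(xs)

def var(xs):
    """Variance: average of (mu - x_i)^2."""
    mu = sum(xs) / len(xs)
    return sum((mu - x) ** 2 for x in xs) / len(xs)

a = [-1, -1, 1, 1]  # every point at distance 1 from the mean 0
b = [-2, 0, 0, 2]   # same mean 0, same average distance 1, but more spread out

# mad(a) == mad(b) == 1, yet var(a) == 1 while var(b) == 2:
# the absolute deviation cannot distinguish these samples, the variance can.
```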

1 Answer

Although both are measures of dispersion, the use of one over the other often boils down to statistical inference as well as level of difficulty in solving decision problems.

  1. If one uses $g(a)=E|X-a|$ to infer a parameter of a random variable $X$, then $g$ is minimized by the median, whereas if one uses $h(a)=E[(X-a)^2]$, the minimum is attained at the mean.
  2. Computationally, $h$ is easier to work with in optimization problems, since one can apply differentiation methods; $g$ is piecewise linear and not differentiable at the data points, which makes it more complicated to optimize.

With today's computing power, one can handle both, obtaining mean estimates as well as median estimates.
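The claim in point 1 is easy to check numerically. A minimal sketch (the sample and the candidate grid are invented for illustration): brute-force both criteria over a grid of candidate values $a$ and see where each is minimized.

```python
xs = [1, 2, 3, 4, 100]  # skewed sample: median 3, mean 22

def g(a):
    """Empirical version of g(a) = E|X - a|."""
    return sum(abs(x - a) for x in xs) / len(xs)

def h(a):
    """Empirical version of h(a) = E[(X - a)^2]."""
    return sum((x - a) ** 2 for x in xs) / len(xs)

grid = [i / 10 for i in range(1001)]  # candidate values 0.0, 0.1, ..., 100.0
best_abs = min(grid, key=g)  # minimizer of g: the median, 3.0
best_sq = min(grid, key=h)   # minimizer of h: the mean, 22.0
```

The result for $h$ also follows from point 2 by differentiation: $h'(a) = -2\,E[X-a]$ vanishes exactly at $a = E[X]$, while $g$ has no derivative at the data points.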