Why does the average of a set of values have the least squared error?


Consider the expression $$\sum_{i}(x_i-\hat x_i)^2,$$ where the $x_i$ are the observed values in a data sample $S$. Here is the question:

Why does this expression attain its minimum value when $\hat x_i$ is the average of the data sample $S$?

I tried taking the derivative of that expression and setting it to zero, but something seems wrong, because $\hat x_i$ looks like it is multi-variable. Can anyone help me out? Thanks a lot!


There are 2 best solutions below


Let's take the function:

$$f(\hat x)=\sum_{i=1}^n (x_i-\hat x)^2$$

Here, we want to find the value of $\hat x$ that minimizes $f(\hat x)$. Although the $x_i$ also appear in this function, they are fixed observed values that do not depend on $\hat x$, so we can treat them as constants. That reduces this to a single-variable calculus problem. Now, let's take the derivative of $f$ with respect to $\hat x$.

$$f'(\hat x)=\sum_{i=1}^n 2(\hat x-x_i)$$

From here, can you find the value of $\hat x$ satisfying $f'(\hat x)=0$? Once you solve that equation, use the second-derivative test to show that it is indeed an absolute minimum.
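For readers who want to check their work, here is one way the two steps can be carried out, using only the $f$ defined above:

$$f'(\hat x)=\sum_{i=1}^n 2(\hat x-x_i)=2n\hat x-2\sum_{i=1}^n x_i=0 \quad\Longrightarrow\quad \hat x=\frac{1}{n}\sum_{i=1}^n x_i$$

$$f''(\hat x)=\sum_{i=1}^n 2=2n>0$$

Since $f$ is a quadratic in $\hat x$ with positive leading coefficient $n$, this single critical point is the global minimum, and it is exactly the sample average.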


This can be solved without calculus.

Let $f(z) =\sum_{i}(x_i-z)^2 $.

Then, writing $\hat x$ for the average of the $n$ values $x_i$, so that $\sum_{i}x_i =n\hat x$,

$$\begin{aligned}
f(z)-f(\hat x) &=\sum_{i}(x_i-z)^2-\sum_{i}(x_i-\hat x)^2\\
&=\sum_{i}\left((x_i-z)^2-(x_i-\hat x)^2\right)\\
&=\sum_{i}\left(x_i^2-2x_iz+z^2-(x_i^2-2x_i\hat x+\hat x^2)\right)\\
&=\sum_{i}\left(2x_i(\hat x-z)+z^2-\hat x^2\right)\\
&=2n\hat x(\hat x-z)+n(z^2-\hat x^2)\\
&=2n\hat x(\hat x-z)+n(z-\hat x)(z+\hat x)\\
&=n(\hat x-z)(2\hat x-(z+\hat x))\\
&=n(\hat x-z)(\hat x-z)\\
&=n(\hat x-z)^2\\
&\ge 0 \quad \text{with equality iff } z=\hat x
\end{aligned}$$
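The identity $f(z)-f(\hat x)=n(\hat x-z)^2$ is easy to check numerically. The snippet below is a minimal sketch, not part of the original answer: the sample `xs`, the helper name `sum_sq_error`, and the trial values of `z` are all made up for illustration.

```python
def sum_sq_error(xs, z):
    """Sum of squared deviations of the samples xs from a candidate value z."""
    return sum((x - z) ** 2 for x in xs)

# Hypothetical data sample, chosen only for illustration.
xs = [2.0, 3.0, 5.0, 10.0]
n = len(xs)
mean = sum(xs) / n  # the sample average, playing the role of x-hat

for z in [0.0, 4.0, 5.0, 7.5]:
    gap = sum_sq_error(xs, z) - sum_sq_error(xs, mean)  # f(z) - f(x-hat)
    predicted = n * (mean - z) ** 2                      # n (x-hat - z)^2
    print(f"z={z}: f(z)-f(mean)={gap}, n*(mean-z)^2={predicted}")
```

For every trial `z`, the two printed quantities should agree, and the gap should be zero only when `z` equals the average.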