Why is the total sum of squares equal to the explained sum of squares plus the residual sum of squares?


Given a set of sample points $(x_i,y_i)$ in 2-dimensional space, let $(x_i,\hat{y_i})$ be the corresponding points on the regression line, and let $\bar{y}$ be the average of the $y_i$.

I want to know why $\sum_{i=1}^n (y_i-\bar{y})^2=\sum_{i=1}^n (y_i-\hat{y_i})^2+\sum_{i=1}^n (\hat{y_i}-\bar{y})^2$?

Best answer:

It's the Pythagorean theorem. There is the linear space $M$ of all possible model predictions, there is the actual observed data point $p$ which typically is not in $M$, and there is the fitted model $m\in M$ that minimizes $\|p-m\|^2$. The normal equations say, in effect, that the error $p-m$ is perpendicular to $M$, so the triangle with vertices $m$, $p$, and $0$ is a right triangle, with $m$ being where the right angle is. Pythagoras says $\|p-0\|^2=\|p-m\|^2+\|m-0\|^2$. The statistician's traditional names for $\|p-0\|^2$, for $\|p-m\|^2$ and for $\|m-0\|^2$ are the total sum of squares, the residual sum of squares, and the explained sum of squares.
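The decomposition can be checked numerically. The sketch below (with a small made-up data set, not taken from the original post) fits a least-squares line by the closed-form formulas and confirms that the total sum of squares equals the residual plus explained sums of squares, up to floating-point error:

```python
# Sanity check of SST = SSE (residual) + SSR (explained) for an OLS fit.
# Data set is illustrative, chosen only for this example.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Closed-form OLS estimates for slope b and intercept a.
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / \
    sum((x - x_bar) ** 2 for x in xs)
a = y_bar - b * x_bar

y_hat = [a + b * x for x in xs]  # fitted values on the regression line

sst = sum((y - y_bar) ** 2 for y in ys)               # total sum of squares
sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))  # residual sum of squares
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)          # explained sum of squares

assert abs(sst - (sse + ssr)) < 1e-9
```

The identity holds exactly (up to rounding) for any data set, but only because the fitted values come from the least-squares solution; for an arbitrary line through the data the cross term does not vanish.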

In your regression setup, $M$ is the subspace of $\mathbb R^n$ spanned by the vector of all $1$s and the vector $x=(x_1,x_2,\ldots,x_n)'$, and the fitted model $m$ would be the LS estimate $(\hat y_1,\ldots,\hat y_n)'$. Each putative regression formula $y=a+bx$ gives rise to a vector of predictions for the $y$ variable, namely $(a+bx_1,a+bx_2,\ldots,a+bx_n)'$; as you vary $a$ and $b$, these prediction vectors trace out all of $M$. The job of LS fitting is to find the $a$ and $b$ values that make these predictions come as close as possible to the actual observed vector $p=(y_1,\ldots,y_n)'$.

It's been eons since I've read it, but I think this is all spelled out in Draper and Smith's Applied Regression Analysis. (Added 4 years too late: Math Reviews MR0020239 tells me that Kolmogorov, A. N., "On the proof of the method of least squares", in the Russian journal Uspehi Matem. Nauk (N. S.) 1(11), (1946), says pretty much the same thing.)

Skipping the geometry: it falls out of the normal equations.
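To spell that out (a standard expansion, in the notation of the question): write each deviation as $(y_i-\bar y)=(y_i-\hat y_i)+(\hat y_i-\bar y)$ and square,

```latex
\begin{align*}
\sum_{i=1}^n (y_i-\bar y)^2
  &= \sum_{i=1}^n \bigl((y_i-\hat y_i)+(\hat y_i-\bar y)\bigr)^2 \\
  &= \sum_{i=1}^n (y_i-\hat y_i)^2
   + \sum_{i=1}^n (\hat y_i-\bar y)^2
   + 2\sum_{i=1}^n (y_i-\hat y_i)(\hat y_i-\bar y).
\end{align*}
```

The normal equations say the residuals $y_i-\hat y_i$ are orthogonal to every vector in $M$; in particular $\sum_i (y_i-\hat y_i)=0$ (orthogonality to the all-$1$s vector) and $\sum_i (y_i-\hat y_i)\hat y_i=0$ (orthogonality to the fitted vector), so the cross term vanishes and the claimed identity follows.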