Linear regression proof that SST = SSR + SSE

[Image: the identity to be proved, SST = SSR + SSE]

My teacher wanted us to try to prove this. I noticed that the summation on the left is SST (the total sum of squares), and that the second summation on the right measures the variability of the y's in the linear regression term. But what is the first summation? And how can I manipulate the right side to get the left side?

Best answer

$\sum_{i=1}^n(y_i-\overline y)^2=\sum_{i=1}^n((\hat y_i-\overline y)+(y_i - \hat y_i))^2$

$=\sum_{i=1}^n((\hat y_i-\overline y)^2+2(\hat y_i-\overline y)(y_i-\hat y_i)+(y_i-\hat y_i)^2)$

$=\sum_{i=1}^n(\hat y_i-\overline y)^2+\sum_{i=1}^n(y_i-\hat y_i)^2+2\sum_{i=1}^n(y_i-\hat y_i)(\hat y_i-\overline y)$

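Substituting the fitted values $\hat y_i=\hat\beta_0+\hat\beta_1x_{i1}+\hat\beta_2x_{i2}+\dots+\hat\beta_mx_{im}$ into the cross term gives

$=\sum_{i=1}^n(\hat y_i-\overline y)^2+\sum_{i=1}^n(y_i-\hat y_i)^2+2\sum_{i=1}^n(y_i-\hat y_i)(\hat\beta_0+\hat\beta_1x_{i1}+\hat\beta_2x_{i2}+\dots+\hat\beta_mx_{im}-\overline y)$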

Now let $\hat u_i=y_i-\hat y_i$ denote the $i$-th residual. The expression becomes

$=\sum_{i=1}^n(\hat y_i-\overline y)^2+\sum_{i=1}^n(y_i-\hat y_i)^2+2\sum_{i=1}^n\hat u_i(\hat\beta_0+\hat\beta_1x_{i1}+\hat\beta_2x_{i2}+\dots+\hat\beta_mx_{im}-\overline y)$

$=\sum_{i=1}^n(\hat y_i-\overline y)^2+\sum_{i=1}^n(y_i-\hat y_i)^2+2(\hat\beta_0-\overline y)\cdot \sum_{i=1}^n \hat u_i+2\hat\beta_1 \sum_{i=1}^n \hat u_ix_{i1}+2\hat\beta_2 \sum_{i=1}^n \hat u_ix_{i2}+\dots+2\hat\beta_m \sum_{i=1}^n \hat u_ix_{im}$

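Because the $\hat\beta_j$ are the least squares estimates of a model with an intercept, the residuals satisfy $\sum_{i=1}^n \hat u_i=0$ and $\sum_{i=1}^n \hat u_ix_{ij}=0 \ \ \forall j=1,2,\dots,m$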
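
A sketch of where these two conditions come from, assuming the $\hat\beta_j$ are the ordinary least squares estimates (the symbol $S$ is introduced here for illustration and does not appear in the original post). Least squares chooses the coefficients to minimize

$S(\beta_0,\beta_1,\dots,\beta_m)=\sum_{i=1}^n(y_i-\beta_0-\beta_1x_{i1}-\dots-\beta_mx_{im})^2$

Setting each partial derivative to zero at the minimizer $(\hat\beta_0,\dots,\hat\beta_m)$ gives

$\frac{\partial S}{\partial\beta_0}=-2\sum_{i=1}^n\hat u_i=0$ and $\frac{\partial S}{\partial\beta_j}=-2\sum_{i=1}^n\hat u_ix_{ij}=0$ for $j=1,2,\dots,m$

The first condition depends on the model having an intercept $\beta_0$. Without one, the residuals need not sum to zero and the decomposition can fail.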

Every cross term therefore vanishes, leaving

$\sum_{i=1}^n(y_i-\overline y)^2=\sum_{i=1}^n(\hat y_i-\overline y)^2+\sum_{i=1}^n(y_i-\hat y_i)^2$
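
The identity can also be checked numerically. The following is a minimal sketch, not part of the original answer: it assumes NumPy is available, fits the model with `np.linalg.lstsq`, and uses synthetic data whose sample size, seed, and coefficients are arbitrary illustration choices.

```python
import numpy as np

# Synthetic data: n observations, m regressors (all values are illustrative).
rng = np.random.default_rng(0)
n, m = 50, 3
X = rng.normal(size=(n, m))
y = 1.5 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=n)

# Design matrix with a leading column of ones for the intercept beta_0.
A = np.column_stack([np.ones(n), X])

# OLS coefficient estimates (beta_0, beta_1, ..., beta_m).
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)

y_hat = A @ beta_hat   # fitted values
u_hat = y - y_hat      # residuals
y_bar = y.mean()

sst = np.sum((y - y_bar) ** 2)               # total sum of squares
ssr = np.sum((y_hat - y_bar) ** 2)           # regression sum of squares
sse = np.sum(u_hat ** 2)                     # error sum of squares
cross = 2 * np.sum(u_hat * (y_hat - y_bar))  # cross term from the expansion

# A.T @ u_hat stacks sum(u_hat) and sum(u_hat * x_j) for each regressor j.
orthogonality = A.T @ u_hat

print("SST      :", sst)
print("SSR + SSE:", ssr + sse)
print("cross    :", cross)
print("A^T u_hat:", orthogonality)
```

Up to floating-point rounding, the cross term and every entry of `A.T @ u_hat` should be zero, and `sst` should equal `ssr + sse`. Dropping the column of ones from `A` is a quick way to see the identity break when the model has no intercept.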