Prove $SST=SSE+SSR$


Prove $$SST=SSE+SSR$$

I start with $$SST= \Sigma (y_i-\bar{y})^2=...=SSE+SSR+ \Sigma 2( y_i-y_i^*)(y_i^*-\bar{y} )$$ and I don't know how to prove that $\Sigma 2( y_i-y_i^*)(y_i^*-\bar{y} )=0$


A note on notation: the residual $e_i$ is $e_i=y_i-y_i^*$, where $y_i^*$ denotes the fitted value. A more common notation for the fitted value is $\hat{y}_i$.
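Before proving the identity, it can help to check it numerically. The sketch below is illustrative only (the small made-up dataset and the use of numpy are assumptions, not part of the question): fit a line by least squares, compute the three sums of squares, and confirm that $SST$ agrees with $SSE+SSR$ up to rounding.

```python
import numpy as np

# Hypothetical small dataset, used only to sanity-check the identity numerically.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

b1, b0 = np.polyfit(x, y, 1)          # least-squares slope and intercept
y_hat = b0 + b1 * x                   # fitted values (y_i^* in the question)

sst = np.sum((y - y.mean()) ** 2)     # total sum of squares
sse = np.sum((y - y_hat) ** 2)        # error (residual) sum of squares
ssr = np.sum((y_hat - y.mean()) ** 2) # regression sum of squares

print(sst, sse + ssr)                 # the two numbers agree up to rounding
```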


There are 3 answers below.

BEST ANSWER

The principle underlying least squares regression is that the sum of the squares of the errors is minimized. We can use calculus to find equations for the parameters $\beta_0$ and $\beta_1$ that minimize the sum of the squared errors.

Let $S = \displaystyle\sum\limits_{i=1}^n \left(e_i \right)^2= \sum \left(y_i - \hat{y_i} \right)^2= \sum \left(y_i - \beta_0 - \beta_1x_i\right)^2$

We want to find $\beta_0$ and $\beta_1$ that minimize the sum, $S$. We start by taking the partial derivative of $S$ with respect to $\beta_0$ and setting it to zero.

$$\frac{\partial{S}}{\partial{\beta_0}} = \sum 2\left(y_i - \beta_0 - \beta_1x_i\right)^1 (-1) = 0$$

Notice that, after dividing through by $-2$, this says, $$\begin{align}\sum \left(y_i - \beta_0 - \beta_1x_i\right) &= 0 \\ \sum \left(y_i - \hat{y_i} \right) &= 0 \qquad (eqn. 1)\end{align}$$

Hence, the sum of the residuals is zero (as expected). Rearranging and solving for $\beta_0$ we arrive at, $$\begin{aligned}\sum \beta_0 &= \sum y_i -\beta_1 \sum x_i\\ n\beta_0 &= \sum y_i -\beta_1 \sum x_i\\ \beta_0 &= \frac{1}{n}\sum y_i -\beta_1 \frac{1}{n}\sum x_i \end{aligned}$$ that is, $\beta_0 = \bar{y} - \beta_1 \bar{x}$.
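As a quick illustration of eqn. 1, here is a minimal sketch (the synthetic data and the use of numpy are illustrative assumptions only) that computes the least-squares slope and the intercept formula above, and confirms the residuals sum to zero up to floating-point error.

```python
import numpy as np

# Synthetic data, purely illustrative.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 1.5 * x + rng.normal(size=50)

# Standard closed-form least-squares slope, then the intercept derived above:
# beta_0 = (1/n) sum(y_i) - beta_1 (1/n) sum(x_i)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

residuals = y - (b0 + b1 * x)
print(residuals.sum())  # ~0 up to floating-point error (eqn. 1)
```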

now taking the partial of $S$ with respect to $\beta_1$ and setting it to zero we have, $$\frac{\partial{S}}{\partial{\beta_1}} = \sum 2\left(y_i - \beta_0 - \beta_1x_i\right)^1 (-x_i) = 0$$

and dividing through by $-2$ and rearranging we have,

$$\sum x_i \left(y_i - \beta_0 - \beta_1x_i\right) = 0$$ $$\sum x_i \left(y_i - \hat{y_i} \right) = 0$$ but, again, we know that $\hat{y_i} = \beta_0 + \beta_1x_i$. Thus, assuming $\beta_1 \neq 0$ (if $\beta_1 = 0$ then $\hat{y_i} = \bar{y}$, and eqn. 2 below follows directly from eqn. 1), $x_i = \frac{1}{\beta_1}\left( \hat{y_i} - \beta_0 \right) = \frac1{\beta_1}\hat{y_i} -\frac{\beta_0}{\beta_1}$. Substituting this into the equation above gives the desired result.

$$\begin{aligned}\sum x_i \left(y_i - \hat{y_i} \right) &= 0\\\sum \left(\frac1{\beta_1}\hat{y_i} - \frac{\beta_0}{\beta_1}\right) \left(y_i - \hat{y_i} \right) &= 0\\\frac1{\beta_1}\sum \hat{y_i} \left(y_i - \hat{y_i} \right) - \frac{\beta_0}{\beta_1} \sum \left(y_i - \hat{y_i} \right)&= 0\end{aligned}$$

Now, the second term is zero (by eqn. 1) and so, we arrive immediately at the desired result: $$\sum \hat{y_i} \left(y_i - \hat{y_i} \right) = 0 \qquad (eqn. 2)$$
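In the same spirit, eqn. 2 can be checked numerically. The sketch below (synthetic data and numpy, both illustrative assumptions) verifies that the fitted values are orthogonal to the residuals.

```python
import numpy as np

# Synthetic data, purely illustrative.
rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 2.0 - 0.5 * x + rng.normal(size=100)

b1, b0 = np.polyfit(x, y, 1)        # least-squares slope and intercept
y_hat = b0 + b1 * x                 # fitted values

print(np.sum(y_hat * (y - y_hat)))  # eqn. 2: ~0 up to floating-point error
```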

Now, let's use eqn. 1 and eqn. 2 to show that $\sum \left(\hat{y_i} - \bar{y} \right) \left( y_i - \hat{y_i} \right) = 0$ - which was your original question.

$$\sum \left(\hat{y_i} - \bar{y} \right) \left( y_i - \hat{y_i} \right) = \sum \hat{y_i} \left( y_i - \hat{y_i} \right) - \bar{y} \sum \left( y_i - \hat{y_i} \right) = 0$$

ANOTHER ANSWER

$$2\sum(y_i-y_i^*)(y_i^*-\bar{y})$$ $$= 2\sum\left[y_i^*(y_i-y_i^*)-\bar{y}(y_i-y_i^*)\right]$$ $$= 2\sum y_i^* e_i - 2\bar{y}\sum e_i$$ $$= 0,$$ since $\sum y_i^* e_i = 0$ and $\sum e_i = 0$ by the normal equations (eqn. 2 and eqn. 1 in the accepted answer).

A THIRD ANSWER

If you have already found the formulas for $b_0$ and $b_1$, but you are having trouble proving that $\sum_{i=1}^n (y_i - \hat{y}_i)(\hat{y}_i - \bar{y}) = 0$, I think that the following proof is an interesting one:

$$\begin{aligned} \sum_{i=1}^n (y_i - \hat{y}_i)(\hat{y}_i - \bar{y}) &= \sum_{i=1}^{n}(y_i - \bar{y} -b_1 (x_i - \bar{x}))(\bar{y} + b_1 (x_i - \bar{x})-\bar{y}) \\ &= b_1 \sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x}) - b_1^2\sum_{i=1}^{n}(x_i - \bar{x})^2 \\ &= b_1 \frac{\sum_{i=1}^{n}(y_i -\bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} \sum_{i=1}^{n}(x_i - \bar{x})^2 - b_1^2\sum_{i=1}^{n}(x_i - \bar{x})^2 \\ &= b_1^2 \sum_{i=1}^{n}(x_i - \bar{x})^2 - b_1^2 \sum_{i=1}^{n}(x_i - \bar{x})^2 \\ &= 0 \end{aligned}$$
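A quick numerical check of this proof, using the same closed-form $b_1$ (the synthetic data and the use of numpy below are illustrative assumptions only):

```python
import numpy as np

# Synthetic data, purely illustrative.
rng = np.random.default_rng(2)
x = rng.normal(size=200)
y = 0.7 * x + rng.normal(size=200)

# Closed-form least-squares slope used in the proof above.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
y_hat = y.mean() + b1 * (x - x.mean())  # hat{y}_i = bar{y} + b_1 (x_i - bar{x})

cross_term = np.sum((y - y_hat) * (y_hat - y.mean()))
print(cross_term)  # ~0 up to floating-point error, as the proof shows
```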