From Probability and Statistics in Engineering by Hines et al.:
Let $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$, where each $\epsilon_i$ has mean $0$ and variance $\sigma^2$, with all $\epsilon_i$ uncorrelated.
Let $SS_E = \sum y_i^2 - n\bar y^2 - \hat \beta_1 S_{xy}$, where $S_{xy} = \sum x_iy_i - \frac{1}{n}(\sum x_i)(\sum y_i)$ and $\hat \beta_1 = \frac{S_{xy}}{S_{xx}}$.
$E[\hat \beta _1] = \beta_1$ and $V(\hat \beta_1) = \frac{\sigma^2}{S_{xx}}$.
Then $E(SS_E) = (n-2) \sigma^2$
How is this derived? I can't figure out a way to show this.
$E[\sum y_i^2] = n\sigma^2 + \sum_i (E[y_i])^2$
$E[\hat \beta_1 S_{xy}] = E[S_{xx}\hat \beta_1 ^2] = S_{xx}(\frac{\sigma^2}{S_{xx}} + \beta_1^2) = \sigma^2 + S_{xx} \beta_1^2$
$E[n \bar y^2] = nE[\bar y^2] = n (\frac{\sigma^2}{n} +(\frac{1}{n}\sum E[y_i])^2 )$
But I can't see a way to show equality with these.
It is better to follow the lucid method provided by @RCL under the general setup. However, you can still find the result with some simple calculations. Here $SS_E$ is called the residual sum of squares (RSS). It is defined as the sum of squared residuals (the differences between observed and predicted values). Suppose $\hat{y_{i}}$ is the predicted value obtained from the linear model. Then, using $\hat{\beta_0}=\overline{y}-\hat{\beta_1}\overline{x}$, \begin{equation} \begin{aligned} SS_E&=\sum_{i=1}^{n}(y_i-\hat{y_{i}})^2\\ &=\sum_{i=1}^{n}(y_i-\hat{\beta_0}-\hat{\beta_1}x_i)^2\\ &=\sum_{i=1}^{n}(y_i-\overline{y}+\hat{\beta_1}\overline{x}-\hat{\beta_1}x_i)^2\\ &=\sum_{i=1}^{n}((y_i-\overline{y})-\hat{\beta_1}(x_i-\overline{x}))^2\\ &=\sum_{i=1}^{n} (y_i-\overline{y})^2+\hat{\beta_1}^2\sum_{i=1}^{n} (x_i-\overline{x})^2-2\hat{\beta_1}\sum_{i=1}^{n}(x_i-\overline{x})(y_i-\overline{y})\\ &=\sum_{i=1}^{n} (y_i-\overline{y})^2+\hat{\beta_1}^2 S_{xx}-2\hat{\beta_1}^2S_{xx}\\ &=\sum_{i=1}^{n} (y_i-\overline{y})^2-\hat{\beta_1}^2 S_{xx} \end{aligned} \end{equation} Thus $E(SS_E)=E(\sum_{i=1}^{n} (y_i-\overline{y})^2)-E(\hat{\beta_1}^2 S_{xx})$. Let us focus on $E(\sum_{i=1}^{n} (y_i-\overline{y})^2)$. It is not a good idea to expand the square at this stage; it is more efficient to substitute $y_i=\beta_0+\beta_1x_i+\epsilon_i$, because the $\epsilon_i$'s are easier to work with: they have mean zero!
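As a sanity check (my own addition, not from the book), the algebraic identity $SS_E=\sum_{i=1}^{n}(y_i-\overline{y})^2-\hat{\beta_1}^2 S_{xx}$ can be verified numerically on simulated data; the variable names and data here are made up for illustration:

```python
import numpy as np

# Check: sum of squared residuals equals sum (y_i - ybar)^2 - beta1hat^2 * S_xx
rng = np.random.default_rng(0)
n = 30
x = rng.normal(size=n)
y = 2.0 + 3.0 * x + rng.normal(size=n)   # arbitrary true beta0, beta1, noise

Sxx = np.sum((x - x.mean()) ** 2)
Sxy = np.sum((x - x.mean()) * (y - y.mean()))
b1 = Sxy / Sxx                           # least-squares slope
b0 = y.mean() - b1 * x.mean()            # least-squares intercept

ss_e_direct = np.sum((y - b0 - b1 * x) ** 2)            # sum of squared residuals
ss_e_identity = np.sum((y - y.mean()) ** 2) - b1**2 * Sxx

print(abs(ss_e_direct - ss_e_identity))  # ~0 up to floating-point error
```

Both quantities agree up to rounding, for any data set, since the identity is purely algebraic.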
Denote $\overline{\epsilon}=\frac{1}{n}\sum_{i=1}^{n}\epsilon_i$
Now, \begin{equation} \begin{aligned} \sum_{i=1}^{n} (y_i-\overline{y})^2&=\sum_{i=1}^{n} (\beta_0+\beta_1x_i+\epsilon_i-\beta_0-\beta_1\overline{x}-\overline{\epsilon})^2\\ &=\sum_{i=1}^{n}(\beta_1(x_i-\overline{x})+(\epsilon_i-\overline{\epsilon}))^2\\ &=\sum_{i=1}^{n} \beta_1^2(x_i-\overline{x})^2+\sum_{i=1}^{n} (\epsilon_i-\overline{\epsilon})^2+2\beta_1 \sum_{i=1}^{n} (x_i-\overline{x})(\epsilon_i-\overline{\epsilon}) \end{aligned} \end{equation} So, $E(\sum_{i=1}^{n} (y_i-\overline{y})^2)=\beta_1^2\sum_{i=1}^{n}(x_i-\overline{x})^2+E(\sum_{i=1}^{n}(\epsilon_i-\overline{\epsilon})^2)+2\beta_1 \sum_{i=1}^{n} (x_i-\overline{x})E(\epsilon_i-\overline{\epsilon})=\beta_1^2 S_{xx}+E(\sum_{i=1}^{n}(\epsilon_i-\overline{\epsilon})^2)$, since the last term is clearly $0$.
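This decomposition, too, is purely algebraic and can be checked numerically (a sketch of mine, with made-up data, not part of the original answer):

```python
import numpy as np

# Check: sum (y_i - ybar)^2 = beta1^2 * sum (x_i - xbar)^2
#                           + sum (e_i - ebar)^2
#                           + 2*beta1 * sum (x_i - xbar)(e_i - ebar)
rng = np.random.default_rng(42)
n, beta0, beta1 = 25, 1.5, -0.7          # arbitrary true parameters
x = rng.uniform(size=n)
e = rng.normal(size=n)                   # the epsilon_i, observed here by construction
y = beta0 + beta1 * x + e

lhs = np.sum((y - y.mean()) ** 2)
dx, de = x - x.mean(), e - e.mean()
rhs = beta1**2 * np.sum(dx**2) + np.sum(de**2) + 2 * beta1 * np.sum(dx * de)

print(abs(lhs - rhs))  # ~0 up to floating-point error
```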
And, $E(\sum_{i=1}^{n}(\epsilon_i-\overline{\epsilon})^2)=E(\sum_{i=1}^{n}\epsilon_i^2-n\overline{\epsilon}^2)=\sum_{i=1}^{n}E(\epsilon_i^2)-\frac{1}{n}E(\sum_{i=1}^{n} \epsilon_i^2 +\sum_{i \not = j} \epsilon_i \epsilon_j)=n \sigma^2 -\sigma^2=(n-1)\sigma^2$, since the cross terms have expectation $0$.
Finally, since $E(\hat{\beta_1}^2 S_{xx})=S_{xx}(V(\hat{\beta_1})+E[\hat{\beta_1}]^2)=S_{xx}(\frac{\sigma^2}{S_{xx}}+\beta_1^2)=\sigma^2+\beta_1^2 S_{xx}$, we get $E(SS_E)=\beta_{1}^2 S_{xx} +(n-1)\sigma^2-\sigma^2- \beta_1^2 S_{xx}=(n-2)\sigma^2$.
(I have skipped some steps in the last part; you should check them. They are not hard to fill in if you use the results above.)
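The final result $E(SS_E)=(n-2)\sigma^2$ can also be checked by Monte Carlo simulation; this is a sketch of my own (parameter values and variable names are arbitrary), not part of the original answer:

```python
import numpy as np

# Monte Carlo check that the average of SS_E over many replications
# is close to (n - 2) * sigma^2.
rng = np.random.default_rng(1)
n, sigma, beta0, beta1 = 10, 2.0, 1.0, 0.5
x = np.linspace(0.0, 1.0, n)                  # fixed design points
Sxx = np.sum((x - x.mean()) ** 2)
reps = 20000

eps = rng.normal(scale=sigma, size=(reps, n)) # one row of errors per replication
y = beta0 + beta1 * x + eps
yc = y - y.mean(axis=1, keepdims=True)        # y_i - ybar within each replication
b1 = yc @ (x - x.mean()) / Sxx                # S_xy / S_xx = beta1-hat per replication
ss_e = np.sum(yc**2, axis=1) - b1**2 * Sxx    # SS_E per replication

print(ss_e.mean(), (n - 2) * sigma**2)        # sample mean should be close to (n-2)*sigma^2
```

With these values the target is $(10-2)\cdot 4 = 32$, and the simulated mean lands within Monte Carlo error of it.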