Does $E(\sum e_i^2) = \sum E(y_i^2) - E(\sum \hat y_i^2)$ hold true?


This was posted as a practice proof for a regression class. I've worked through it from the perspective of $SSE = SST - SSR$, but I cannot reduce it to the given equation. There were other mistakes in this practice homework, so it's possible this problem is missing something, but I don't want to blame my inability to prove the equation on the problem's design.

We know that $E(\sum e_i^2) = E\sum (y_i - \hat y_i)^2$, but when I expand it, this side does not equal the right side of the equation. I thought to start from $SSE = SST - SSR$ and reduce that setup—$E\sum (y_i - \hat y_i)^2 = E\sum (y_i - \bar y)^2 - E\sum (\hat y_i - \bar y)^2$—but again I was unable to reduce it to the equation given in the title. As I expand it, there are leftover $2y_i \hat y_i$ and $2\hat y_i \bar y_i$ terms, which I can't remove. I appreciate any insight into what I'm missing (or what the initial question is missing).

BEST ANSWER

This holds in general, for both simple and multiple linear regression. Also, the expectations are not required: the identity holds for the sums themselves.

Writing the model in matrix form \begin{equation} Y = X\beta +\epsilon \end{equation} where $X$ is the $n \times 2$ design matrix, we have \begin{equation} \sum e_i^2 = e^T e \end{equation} where \begin{equation} e = Y - \hat{Y} \end{equation} and \begin{equation} \hat{Y} = X\hat{\beta}, \qquad \hat{\beta} = (X^T X)^{-1}X^T Y. \tag{1} \end{equation} So \begin{equation} \sum e_i^2 = (Y - \hat{Y})^T(Y - \hat{Y}) = Y^TY - 2\hat{Y}^TY + \hat{Y}^T\hat{Y}. \end{equation} It's easy to see that $\sum_i y_i^2 = Y^TY$ and $\hat{Y}^T\hat{Y} = \sum_i \hat{y}_i^2$. For the cross term, \begin{equation} \sum_i \hat{y}_iy_i = \hat{Y}^TY = (X\hat{\beta})^T Y = \hat{\beta}^T X^T Y. \tag{2} \end{equation} But equation $(1)$ tells us that \begin{equation} X^T Y = (X^T X)\hat{\beta}. \tag{3} \end{equation} Replacing $(3)$ in $(2)$, you get \begin{equation} \sum_i \hat{y}_iy_i = \hat{\beta}^T(X^T X)\hat{\beta} = (X\hat{\beta})^T(X\hat{\beta}) = \hat{Y}^T \hat{Y} = \sum_i \hat{y}_i^2. \tag{4} \end{equation} Now, expanding $\sum e_i^2$ componentwise, we get \begin{equation} \sum_i e_i^2 = \sum y_i^2 - 2 \sum y_i\hat{y}_i + \sum \hat{y}_i^2. \tag{5} \end{equation} Using equation $(4)$ in $(5)$, we get \begin{equation} \sum_i e_i^2 = \sum y_i^2 - 2 \sum \hat{y}_i^2 + \sum \hat{y}_i^2, \end{equation} that is

\begin{equation} \sum_i e_i^2 = \sum y_i^2 - \sum \hat{y}_i^2 \end{equation}
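As a quick numerical sanity check (not part of the original argument), here is a NumPy sketch on synthetic data; the seed, sample size, and coefficients are illustrative choices:

```python
import numpy as np

# Verify sum(e_i^2) = sum(y_i^2) - sum(yhat_i^2) on a simulated
# simple linear regression fit by ordinary least squares.
rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
y = 2.0 + 3.0 * x + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])             # n x 2 design matrix
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]  # (X^T X)^{-1} X^T Y
y_hat = X @ beta_hat
e = y - y_hat

sse = np.sum(e**2)
rhs = np.sum(y**2) - np.sum(y_hat**2)
print(np.isclose(sse, rhs))  # True
```

The identity holds exactly in the algebra; the `isclose` comparison only absorbs floating-point rounding.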


You are close. I presume that from $SSE = SST - SSR$ you have $$\sum_i e_i^2 = \sum_i (y_i - \bar{y})^2 - \sum_i (\hat{y}_i - \bar{y})^2 = \sum_i y_i^2 - \sum_i \hat{y}_i^2 - 2 \sum_i (y_i - \hat{y}_i) \bar{y},$$ and are wondering what to do with the extra term.

It turns out the last term is zero; you can prove this by looking at the "normal equations" (i.e. look at the derivation of the least squares coefficients) or by laboriously plugging in the definition of $\hat{y}_i$.
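A quick numerical check of that claim (my own addition, with illustrative synthetic data): with an intercept in the model, the residuals sum to zero, so the extra term vanishes.

```python
import numpy as np

# Check that the leftover term 2 * ybar * sum(y_i - yhat_i) vanishes
# when the least-squares fit includes an intercept.
rng = np.random.default_rng(1)
x = rng.uniform(size=40)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=40)

X = np.column_stack([np.ones_like(x), x])        # intercept column
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta_hat

extra = 2.0 * y.mean() * resid.sum()             # the extra term
print(np.isclose(extra, 0.0, atol=1e-8))  # True
```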

Note that the result holds even without the expectations.

[By the way, why do you have an index $i$ in $\bar{y}_i$? Isn't $\bar{y} := \frac{1}{n} \sum_i y_i$?]


To prove $$E(\sum e_i^2) = \sum E(y_i^2) - E(\sum \hat y_i^2),$$ it suffices to prove $$\sum (y_i-\hat y_i)^2=\sum y_i^2-\sum\hat y_i^2.$$ Expanding the left side, this becomes $$\sum y_i^2+\sum\hat y_i^2-2\sum y_i\hat y_i=\sum y_i^2-\sum\hat y_i^2,$$ which reduces to $$2\sum\hat y_i^2-2\sum y_i\hat y_i=0.$$ So we need to prove $$\sum(y_i-\hat y_i)\hat y_i=0,$$ i.e. $$\sum(y_i-\beta_0-\beta_1x_i)(\beta_0+\beta_1x_i)=0,$$ or equivalently $$\beta_0\sum(y_i-\beta_0-\beta_1x_i)+\beta_1\sum(y_i-\beta_0-\beta_1x_i)x_i=0.$$

In least squares regression, the sum of the squares of the errors is minimized: $$ SSE=\sum e_i^2= \sum\left(y_i - \hat{y}_i \right)^2= \sum\left(y_i -\beta_0- \beta_1x_i\right)^2. $$

Take the partial derivative of $SSE$ with respect to $\beta_0$ and set it to zero: $$ \frac{\partial{SSE}}{\partial{\beta_0}} = \sum 2\left(y_i - \beta_0 - \beta_1x_i\right)(-1) = 0, $$ so $$ \sum\left(y_i - \beta_0 - \beta_1x_i\right) = 0. $$

Take the partial derivative of $SSE$ with respect to $\beta_1$ and set it to zero: $$ \frac{\partial{SSE}}{\partial{\beta_1}} = \sum 2\left(y_i - \beta_0 - \beta_1x_i\right)(-x_i) = 0, $$ so $$ \sum \left(y_i - \beta_0 - \beta_1x_i\right)x_i = 0. $$

Hence $$\beta_0\sum(y_i-\beta_0-\beta_1x_i)+\beta_1\sum(y_i-\beta_0-\beta_1x_i)x_i=0,$$ and therefore $$E(\sum e_i^2) = \sum E(y_i^2) - E(\sum \hat y_i^2).$$
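The two normal equations above, and the orthogonality $\sum(y_i-\hat y_i)\hat y_i=0$ that follows from them, can be checked numerically; this is a sketch with illustrative simulated data, not part of the original answer:

```python
import numpy as np

# Check sum(e_i) = 0 and sum(e_i * x_i) = 0 (the normal equations),
# and hence sum((y_i - yhat_i) * yhat_i) = 0, for a least-squares line.
rng = np.random.default_rng(2)
x = rng.normal(size=30)
y = -1.0 + 0.5 * x + rng.normal(size=30)

b1, b0 = np.polyfit(x, y, 1)      # least-squares slope and intercept
e = y - (b0 + b1 * x)             # residuals
y_hat = b0 + b1 * x

print(np.isclose(e.sum(), 0.0, atol=1e-8))            # sum e_i = 0
print(np.isclose((e * x).sum(), 0.0, atol=1e-8))      # sum e_i x_i = 0
print(np.isclose((e * y_hat).sum(), 0.0, atol=1e-8))  # orthogonality
```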