Prove this formula about residuals in case there is intercept in the OLS estimator


I'm learning about the OLS estimator and am having difficulty computing the $R^2$. First, here are the notations used in my lecture notes:

$X_{i} \equiv\left(\begin{array}{c}{X_{i 1}} \\ {X_{i 2}} \\ {\vdots} \\ {X_{i K}}\end{array}\right)$ of size $K \times 1$, $\beta \equiv\left(\begin{array}{c}{\beta_{1}} \\ {\beta_{2}} \\ {\vdots} \\ {\beta_{K}}\end{array}\right)$ of size $K \times 1$, and $\epsilon \equiv\left(\begin{array}{c}{\epsilon_{1}} \\ {\epsilon_{2}} \\ {\vdots} \\ {\epsilon_{n}}\end{array}\right)$ of size $n \times 1$.

$X \equiv\left(\begin{array}{cccc}{X_{11}} & {X_{12}} & {\cdots} & {X_{1 K}} \\ {X_{21}} & {X_{22}} & {\cdots} & {X_{2 K}} \\ {\vdots} & {\vdots} & {\vdots} & {\vdots} \\ {X_{n 1}} & {X_{n 2}} & {\cdots} & {X_{n K}}\end{array}\right)$ of size $n \times K$, and $Y \equiv\left(\begin{array}{c}{Y_{1}} \\ {Y_{2}} \\ {\vdots} \\ {Y_{n}}\end{array}\right)$ of size $n \times 1$

My model is $Y_{i}=X_{i}^{\prime} \beta+\epsilon_{i}$ for $i=1, \ldots, n$ or equivalently $Y=X \beta+\epsilon$. From FOC, we have $$\hat{\beta} =\left(X^{\prime} X\right)^{-1} X^{\prime} Y =\left(\sum_{i=1}^{n} X_{i} X_{i}^{\prime}\right)^{-1}\left(\sum_{i=1}^{n} X_{i} Y_{i}\right) =\left(\frac{1}{n} \sum_{i=1}^{n} X_{i} X_{i}^{\prime}\right)^{-1}\left(\frac{1}{n} \sum_{i=1}^{n} X_{i} Y_{i}\right)$$
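The closed form above can be checked numerically. Below is a minimal NumPy sketch (the data, sizes, and coefficients are made up for illustration); solving the normal equations $X'X\hat\beta = X'Y$ is preferred over forming the explicit inverse:

```python
import numpy as np

# Numerical sketch of beta_hat = (X'X)^{-1} X'Y on simulated data.
rng = np.random.default_rng(0)
n, K = 50, 3
# First column of ones plays the role of X_{i1} = 1 (intercept).
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([2.0, -1.0, 0.5])          # arbitrary "true" coefficients
Y = X @ beta + rng.normal(size=n)

# Solve the normal equations rather than inverting X'X explicitly.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(beta_hat)
```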

Let $P_X = X\left(X^{\prime} X\right)^{-1} X^{\prime}$. Then the OLS fitted values $\hat{Y} \equiv X \hat{\beta}= P_XY$ and $\hat \epsilon \equiv Y - \hat Y$.
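The key facts used below are that $P_X$ is symmetric and idempotent, and that the residuals are orthogonal to every column of $X$. A small numerical sketch (simulated data, names chosen for illustration):

```python
import numpy as np

# Properties of the projection ("hat") matrix P_X = X (X'X)^{-1} X'.
rng = np.random.default_rng(1)
n, K = 30, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
Y = rng.normal(size=n)

P = X @ np.linalg.inv(X.T @ X) @ X.T    # P_X
Y_hat = P @ Y                           # fitted values X beta_hat
e_hat = Y - Y_hat                       # residuals

assert np.allclose(P, P.T)              # symmetric: P_X' = P_X
assert np.allclose(P @ P, P)            # idempotent: projecting twice = once
assert np.allclose(X.T @ e_hat, 0)      # X' eps_hat = 0: residuals orthogonal to col(X)
```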

Then I have an exercise:

If there is intercept, i.e. $X_{i1} = 1$ for all $i=1,\ldots,n$, then $$\sum_{i=1}^{n}\left(Y_{i}-\overline{Y}\right)^{2} - \sum_{i=1}^{n} \hat{\epsilon}_{i}^{2}=\sum_{i=1}^{n}\left(\hat{Y}_{i}-\overline{Y}\right)^{2}$$ where $$\overline Y = \frac{1}{n} \sum_{i=1}^{n} Y_{i}$$


My attempt:

We have $$\begin{aligned} \sum_{i=1}^{n} (Y_{i}-\overline{Y})^{2} - \sum_{i=1}^{n} \hat{\epsilon}_{i}^{2} &= \sum_{i=1}^{n}(Y^2_{i}-2Y_i\overline{Y} +\overline{Y}^2) - \sum_{i=1}^{n} {(Y_i - \hat Y_i)}^{2}\\ &= \sum_{i=1}^{n} (Y^2_{i}-2Y_i\overline{Y} +\overline{Y}^2) - \sum_{i=1}^{n} (Y^{2}_i -2Y_i \hat Y_i+ \hat Y^2_i) \\&= \sum_{i=1}^{n} Y_i^2 -2\overline Y \sum_{i=1}^{n} Y_i +n \overline Y^2 - \sum_{i=1}^{n} Y_i^2 +2\sum_{i=1}^{n} Y_i \hat Y_i - \sum_{i=1}^{n} \hat Y_i^2 \\ &= -2 \overline Y n\overline Y +n \overline Y^2 +2\sum_{i=1}^{n} Y_i \hat Y_i - \sum_{i=1}^{n} \hat Y_i^2 \\ &=-n\overline Y^2 +2\sum_{i=1}^{n} Y_i \hat Y_i - \sum_{i=1}^{n} \hat Y_i^2\end{aligned}$$


After that, I'm stuck at using the fact that the model has intercept. Could you please help me finish the proof? Thank you so much!

Accepted answer:

You can start by showing the orthogonal decomposition, i.e., $$ \sum ( Y_i - \bar{Y})^2 = \sum(\hat{Y}_i - \bar{Y})^2 + \sum \hat{\epsilon}_i^2, $$ and then just rearrange the equation. So, start with \begin{align} \sum ( Y_i - \bar{Y})^2& = \sum ( Y_i - \hat{Y}_i + \hat{Y}_i - \bar{Y})^2\\ &= \sum ( \hat{Y}_i - \bar{Y})^2 + \sum ( Y_i - \hat{Y}_i)^2 + 2\sum(\hat{Y}_i - \bar{Y})(Y_i-\hat{Y}_i), \end{align} where the cross term vanishes: $$ \sum(\hat{Y}_i - \bar{Y})(Y_i-\hat{Y}_i) = \sum (X_i'\hat{\beta}-\bar{Y})\hat{\epsilon}_i=\hat{\beta}'\sum X_i\hat{\epsilon}_i-\bar{Y}\sum\hat{\epsilon}_i=0-0=0. $$ Both zeroes come from the first-order conditions. Algebraically, the column space of $X$ is orthogonal to the residuals $\hat{\epsilon}$, so $\sum X_i\hat{\epsilon}_i = X'\hat{\epsilon}=0$; and $\sum \hat{\epsilon}_i = 0$ is the component of this condition corresponding to the intercept. Namely, the objective is $\arg\min_\beta \|Y-X\beta\|^2$, so taking the derivative w.r.t. the intercept $\beta_1$ (recall $X_{i1}=1$) and evaluating at $\hat{\beta}$ gives $$ -2\sum_{i=1}^n \Big(Y_i - \hat{\beta}_1 - \sum_{j=2}^K\hat{\beta}_j X_{ij}\Big) = -2 \sum_{i=1}^n\hat{\epsilon}_i = 0. $$
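The decomposition (and the fact that it relies on the intercept making the residuals sum to zero) can be verified numerically. A minimal NumPy sketch on simulated data:

```python
import numpy as np

# Check sum(Y_i - Ybar)^2 = sum(Yhat_i - Ybar)^2 + sum(eps_i^2),
# which holds here because the design matrix contains an intercept column.
rng = np.random.default_rng(2)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # X_{i1} = 1
Y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
Y_hat = X @ beta_hat
e_hat = Y - Y_hat
Ybar = Y.mean()

SST = np.sum((Y - Ybar) ** 2)       # total sum of squares
SSR = np.sum((Y_hat - Ybar) ** 2)   # explained sum of squares
SSE = np.sum(e_hat ** 2)            # residual sum of squares

assert np.isclose(e_hat.sum(), 0.0)  # intercept => residuals sum to zero
assert np.isclose(SST, SSR + SSE)    # the orthogonal decomposition
```

Dropping the column of ones from `X` breaks the first assertion, and with it the decomposition, which is exactly why the exercise assumes an intercept.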