Proof $E[\hat \sigma ^2] = E\left( \frac{1}{n-2} \Sigma(y_i-\hat{y_i})^2 \right) = \sigma ^2$: Linear Regression


I am trying to prove that the estimated variance of the residual

$$\hat \sigma ^2 = \frac{\sum_{i=1}^n (y_i-\hat{y}_i)^2}{n-2}$$

is an unbiased estimator of the variance of the error $\sigma^2$.

So far what I know is that

$$\hat{y}_i = \hat \beta_0 + \hat \beta_1 x_i,$$ and with help I was able to prove the property $$E\left[\sum_{i=1}^n (y_i-\bar y)^2\right] = (n-1)\sigma^2+\beta_1^2 \sum_{i=1}^n (x_i-\bar x)^2.$$ I also expanded the expression and played around with the $\sum E[y_i]$ and $\sum E[y_i^2]$ terms, but was not sure how to manipulate the $\sum x_i$ terms.

Can I get some help, please?

2 Answers


Note that $\{y_i\}_{i=1}^n$ are independent, with $y_i \sim N(\beta_0 + \beta_1x_i, \sigma^2)$, and $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1x_i$ is the estimator of $\mathbb{E}y_i$. Under this normality assumption, $$ \sum_{i=1}^n\frac{(y_i - \hat{y}_i)^2}{\sigma^2} \sim \chi^2_{n-2}, $$ where two degrees of freedom are "lost" due to the estimation of $\beta_0$ and $\beta_1$. Since a $\chi^2_{n-2}$ random variable has mean $n-2$, $$ \mathbb{E}\sum_{i=1}^n\frac{(y_i - \hat{y}_i)^2}{n-2} =\frac{\sigma^2}{n-2}\,\mathbb{E}\sum_{i=1}^n\frac{(y_i - \hat{y}_i)^2}{\sigma^2} = \frac{\sigma^2(n-2)}{n-2} = \sigma^2. $$
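As a quick numerical sanity check of the unbiasedness claim, here is a short Monte Carlo sketch in Python (the model coefficients, sample size, and noise level are illustrative choices, not from the answer): averaging $\hat\sigma^2$ over many simulated datasets should land near the true $\sigma^2$.

```python
import numpy as np

# Illustrative true parameters (arbitrary choices for the demo).
rng = np.random.default_rng(0)
n, beta0, beta1, sigma2 = 30, 1.0, 2.0, 4.0
x = np.linspace(0.0, 10.0, n)

estimates = []
for _ in range(20_000):
    # Simulate y_i = beta0 + beta1 * x_i + eps_i, eps_i ~ N(0, sigma2).
    y = beta0 + beta1 * x + rng.normal(scale=np.sqrt(sigma2), size=n)
    # OLS fit: polyfit returns (slope, intercept) for degree 1.
    b1, b0 = np.polyfit(x, y, 1)
    resid = y - (b0 + b1 * x)
    # sigma_hat^2 with the n - 2 divisor from the question.
    estimates.append(resid @ resid / (n - 2))

print(np.mean(estimates))  # should be close to sigma2 = 4.0
```

The sample mean of the 20,000 estimates settles near 4.0, consistent with $E[\hat\sigma^2] = \sigma^2$.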


If you are familiar with matrix notation in linear regression, let's denote $(y_1, \ldots, y_n)^T$ by $y$, $(\hat{y}_1, \ldots, \hat{y}_n)^T$ by $\hat{y}$, and the $n \times 2$ design matrix whose $i$-th row is $\begin{bmatrix} 1 & x_{i} \end{bmatrix}$ by $X$. It is well known that $\hat{y} = Hy$, where $H \equiv X(X^TX)^{-1}X^T$ is the "hat matrix".

Using the above notation, we have \begin{align} (n - 2)\hat{\sigma}^2 = (y - \hat{y})^T(y - \hat{y}) = y^T(I - H)y. \end{align}

Assuming the error $\varepsilon = y - X\beta$ has mean zero and covariance matrix $E[\varepsilon\varepsilon^T] = \sigma^2 I_n$, we have the following calculation: \begin{align} & E[(n - 2)\hat{\sigma}^2] \\ = {} & E[\text{tr}(y^T(I - H)y)] \\ = {} & E[\text{tr}((I - H)yy^T)] \quad \text{(since $\text{tr}(AB) = \text{tr}(BA)$)} \\ = {} & \text{tr}((I - H)E[yy^T]) \quad \text{(by linearity of trace and expectation)} \\ = {} & \text{tr}((I - H)E[(\varepsilon + X\beta)(\varepsilon + X\beta)^T]) \\ = {} & \text{tr}((I - H)E[\varepsilon\varepsilon^T] + (I - H)X\beta\beta^TX^T) \quad \text{(cross terms vanish since $E[\varepsilon] = 0$)} \\ = {} & \sigma^2\text{tr}(I - H) \quad \text{(since $(I - H)X = X - X = 0$)} \\ = {} & \sigma^2\text{tr}(I) - \sigma^2\text{tr}(H) \\ = {} & (n - 2)\sigma^2. \quad \text{(since $\text{tr}(H) = \text{tr}(X(X^TX)^{-1}X^T) = \text{tr}(X^TX(X^TX)^{-1}) = \text{tr}(I_2) = 2$)} \end{align} This proves the claim.

More generally, if $X$ is of size $n \times p$ and of full column rank, then exactly the same argument shows that $E[\hat{\sigma}^2] = \sigma^2$.
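The two facts the trace argument hinges on, $\text{tr}(H) = p$ and $(I - H)X = 0$, can be verified numerically. A minimal Python sketch (the sizes $n = 50$, $p = 4$ and the random Gaussian design are arbitrary illustrative choices):

```python
import numpy as np

# Random full-column-rank n x p design matrix (illustrative sizes).
rng = np.random.default_rng(1)
n, p = 50, 4
X = rng.normal(size=(n, p))

# Hat matrix H = X (X^T X)^{-1} X^T.
H = X @ np.linalg.inv(X.T @ X) @ X.T
I = np.eye(n)

print(np.trace(H))                  # tr(H) = p, here 4
print(np.max(np.abs((I - H) @ X)))  # (I - H) X = 0, up to round-off
```

With $p = 2$ (an intercept column and one covariate column) this reduces exactly to the simple-regression case above.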