The unbiased estimator of the variance of $\widehat{\beta}_1$ in simple linear regression


Given the simple linear regression model $Y = \beta_0 + \beta_1x + U$, where $U\sim N(0,\sigma^2)$, I know how to derive \begin{equation} \text{Var}[\widehat{\beta}_1] = \dfrac{\sigma^2}{\sum_{i=1}^n(x_i - \overline{x})^2} \end{equation}
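As a numerical sanity check (not a proof), this variance formula can be verified by Monte Carlo: simulate the model many times, compute the OLS slope each time, and compare its empirical variance with $\sigma^2/\sum(x_i-\overline{x})^2$. The sample size, seed, and coefficient values below are arbitrary choices.

```python
import numpy as np

# Monte Carlo check of Var(beta1_hat) = sigma^2 / sum((x_i - xbar)^2).
# n, sigma, reps, beta0, beta1 are illustrative choices, not from the question.
rng = np.random.default_rng(0)
n, sigma, reps = 20, 2.0, 20_000
beta0, beta1 = 1.0, 3.0
x = np.linspace(0.0, 1.0, n)
Sxx = np.sum((x - x.mean()) ** 2)

b1_draws = np.empty(reps)
for r in range(reps):
    y = beta0 + beta1 * x + rng.normal(0.0, sigma, n)
    # OLS slope: sum (x_i - xbar)(y_i - ybar) / Sxx
    b1_draws[r] = np.sum((x - x.mean()) * (y - y.mean())) / Sxx

theoretical = sigma**2 / Sxx
empirical = b1_draws.var()
```

With 20,000 replications the empirical variance should agree with the theoretical value to within a few percent.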

However, I do not know how to prove that \begin{equation} S^2 := \dfrac{\frac{1}{n-2}\sum_{i=1}^n {\widehat{U}_i}^2}{\sum_{i=1}^n(x_i - \overline{x})^2} \end{equation} is the unbiased estimator of the variance of $\widehat{\beta}_1$, that is, $E[S^2] = \text{Var}[\widehat{\beta}_1]$.

I started \begin{align} E[S^2] &= E\left[ \dfrac{\frac{1}{n-2}\sum_{i=1}^n {\widehat{U}_i}^2}{\sum_{i=1}^n(x_i - \overline{x})^2} \right]\\[1ex] &= \frac{1}{n-2} \cdot \frac{1}{\sum_{i=1}^n(x_i - \overline{x})^2} E\left[ \sum_{i=1}^n {\widehat{U}_i}^2\right] \end{align} where we notice that we only have to show that $\displaystyle E\left[ \sum_{i=1}^n {\widehat{U}_i}^2\right] = \sigma^2(n-2)$. Thus, I continued \begin{align} E\left[\sum_{i=1}^n {\widehat{U}_i}^2 \right] &= \sum_{i=1}^n E[{\widehat{U}_i}^2]\\[1ex] &= \sum_{i=1}^n E\left[(Y_i - \widehat{\beta}_0 - \widehat{\beta}_1x_i)^2\right]\\[1ex] &= \sum_{i=1}^n E\left[{Y_i}^2 + {\widehat{\beta}_0}^2 + {\widehat{\beta}_1}^2{x_i}^2 - 2\widehat{\beta}_0Y_i - 2\widehat{\beta}_1x_iY_i + 2\widehat{\beta}_0\widehat{\beta}_1x_i \right]\\[1ex] &= \sum_{i=1}^n \left(E[{Y_i}^2] + E[{\widehat{\beta}_0}^2] + {x_i}^2E[{\widehat{\beta}_1}^2] - 2E[\widehat{\beta}_0Y_i] - 2x_iE[\widehat{\beta}_1Y_i] + 2x_iE[\widehat{\beta}_0\widehat{\beta}_1]\right) \end{align} but I am not sure how to continue. Could you kindly show how, preferably without matrix algebra?

EDIT: I think I made some progress. Knowing that $E[XY] = \text{cov}(X,Y) + E[X]E[Y]$ and that $\text{cov}(X,X) = \text{Var}(X)$ gets us going. In addition, we need to know that \begin{align*} Y_i & \sim N\left(\beta_0 + \beta_1x_i, \sigma^2\right)\\ \widehat{\beta}_0 & \sim N\left(\beta_0, \frac{\sum {x_i}^2}{n\sum(x_i - \overline{x})^2}\sigma^2\right)\\ \widehat{\beta}_1 & \sim N\left(\beta_1, \frac{1}{\sum(x_i - \overline{x})^2} \sigma^2 \right) \end{align*}

From these facts we get \begin{align} E[{Y_i}^2] &= \text{cov}(Y_i,Y_i) + E[Y_i]E[Y_i]\\ &= \text{Var}(Y_i) + E[Y_i]^2\\ &= \sigma^2 + (\beta_0 + \beta_1x_i)^2 \end{align} Similarly \begin{align} E[\widehat{\beta}_0^2] &= \text{Var}(\widehat{\beta}_0) + E[\widehat{\beta}_0]^2\\ &= \frac{\sum {x_i}^2}{n\sum(x_i - \overline{x})^2}\sigma^2 + {\beta_0}^2 \end{align} and \begin{align} E[\widehat{\beta}_1^2] &= \text{Var}(\widehat{\beta}_1) + E[\widehat{\beta}_1]^2\\ &= \frac{1}{\sum(x_i - \overline{x})^2}\sigma^2 + {\beta_1}^2 \end{align}

To continue, I now need to know how to handle the last three expected values. Any help on that?


You have a typo: the numerator of $S^2$ should be $\frac{1}{n-2}\sum_{i=1}^n\hat{U}_i^2$ where $\hat{U}_i$ is the OLS residual of the $i$-th observation.

Let $X=(\iota\; x)$ be the $n\times 2$ matrix of regressors ($\iota$ is the $n\times 1$ column of $1$'s). Let $P=X(X'X)^{-1}X'$ and $M=I_n-P$. Then with $\hat{U}=(\hat{U}_1,\ldots,\hat{U}_n)'$ $$ \sum_{i=1}^n\hat{U}_i^2=\hat{U}'\hat{U}=(MY)'(MY)=(MU)'MU=U'MU. $$ Recall that $M$ is both symmetric and idempotent. Then, using the facts that a number equals its trace and that trace and expectation commute, we have \begin{align*} E\left(\sum_{i=1}^n\hat{U}_i^2\right)&=E[U'MU]=E[\text{trace}(U'MU)]=E[\text{trace}(MUU')]\\ &=\text{trace}[E(MUU')]=\text{trace}[ME(UU')]\\ &=\text{trace}[M\sigma^2I_n]=\sigma^2\text{trace}[M]. \end{align*} To find $\text{trace}[M]$, we note \begin{align*} \text{trace}[M]&=\text{trace}[I_n-X(X'X)^{-1}X']=\text{trace}(I_n)-\text{trace}[X(X'X)^{-1}X']\\ &=\text{trace}(I_n)-\text{trace}[X'X(X'X)^{-1}]=\text{trace}(I_n)-\text{trace}(I_2)=n-2. \end{align*} It remains to put these results together. I leave that to you.
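Both facts used above can be checked numerically: $\text{trace}[M]=n-2$ holds exactly for any full-rank design, and $E\left[\sum_i \hat{U}_i^2\right]=\sigma^2(n-2)$ can be approximated by simulation. A minimal sketch (the particular $n$, $\sigma$, $x$ values, and true coefficients are arbitrary):

```python
import numpy as np

# Check trace(M) = n - 2 and E[sum of squared residuals] = sigma^2 (n - 2).
# n, sigma, x, and the true coefficients below are illustrative choices.
rng = np.random.default_rng(1)
n, sigma = 15, 1.5
x = rng.uniform(0.0, 10.0, n)
X = np.column_stack([np.ones(n), x])                # X = (iota  x)
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)   # M = I_n - X(X'X)^{-1}X'
trace_M = np.trace(M)                               # should equal n - 2

reps = 20_000
ssr = np.empty(reps)
for r in range(reps):
    U = rng.normal(0.0, sigma, n)
    Y = 2.0 + 0.5 * x + U
    Uhat = M @ Y                                    # residuals: Uhat = MY = MU
    ssr[r] = Uhat @ Uhat                            # sum of squared residuals
mean_ssr = ssr.mean()
expected_ssr = sigma**2 * (n - 2)
```

The trace identity holds up to floating-point error, and the simulated mean of $\sum_i \hat{U}_i^2$ should be close to $\sigma^2(n-2)$, confirming that dividing by $n-2$ makes the estimator unbiased.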