Higher Variance due to Overfitting.

This is a homework question.

Consider the following two regression models $$\mathbf y=\mathbf X_1\mathbf {\beta_1} + \mathbf {\varepsilon}$$ $$\mathbf y=\mathbf X_1\mathbf {\beta_1} + \mathbf X_2\mathbf {\beta_2} + \mathbf {\varepsilon}$$

where $\operatorname{Disp} (\mathbf {\varepsilon}) =\sigma^2\mathbf I$ is the dispersion (covariance) matrix of the error vector.

Now, if the first model is correct, show that for any vector $\mathbf u$ one has

$$\operatorname{Var} \left( \mathbf {u'}\hat {\mathbf {\beta_1}}^{(2)}\right) \ge \operatorname{Var} \left( \mathbf {u'}\hat {\mathbf {\beta_1}}^{(1)}\right)$$

where $\hat{\mathbf \beta_1}^{(1)}$ and $\hat {\mathbf \beta_1}^{(2)}$ are the least-squares estimators of $\mathbf \beta_1$ from the first and second models, respectively.

My attempt: The least-squares (LS) estimators of $\mathbf \beta_1$ obtained from the first and second models are $$\hat {\mathbf {\beta_1}}^{(1)}=\mathbf {(X_1'X_1)^{-1}X_1'y}$$ $$\hat {\mathbf {\beta_1}}^{(2)}=\mathbf {(X_1'X_1)^{-1}X_1'}( \mathbf y - \mathbf {X_2\hat {\beta_2}^{(2)}})$$ Since $\mathbf {\hat \beta_2^{(2)}}$ is the LS estimator of $\mathbf \beta_2$ in the second model, it is a linear function of $\mathbf y$, so we may write $\mathbf {X_2\hat \beta_2^{(2)}}= \mathbf {Sy}$ for some matrix $\mathbf S$ that does not depend on $\mathbf y$. This gives $\hat {\mathbf {\beta_1}}^{(2)}=\mathbf {(X_1'X_1)^{-1}X_1'(I - S)y}$. Hence

$\operatorname{Var} \left( \mathbf {u'}\hat {\mathbf {\beta_1}}^{(2)}\right) - \operatorname{Var} \left( \mathbf {u'}\hat {\mathbf {\beta_1}}^{(1)}\right)$

$= \sigma^2\Bigl( \mathbf {u'\left(X_1'X_1\right)^{-1}X_1'\left(I - S\right)\left(I - S\right)'X_1\left(X_1'X_1\right)^{-1}u - u'\left(X_1'X_1\right)^{-1}X_1'X_1\left(X_1'X_1\right)^{-1}u}\Bigr)$

$= \sigma^2\Bigl(\mathbf {z'(I - S)(I - S)'z - z'z}\Bigr)$

(where $\mathbf z = \mathbf {X_1\left(X_1'X_1\right)^{-1}u}$)

$= \sigma^2\Bigl(\mathbf {z'SS'z - z'S'z -z'Sz}\Bigr)$

This is where I am stuck. I know that the first term in the brackets is non-negative because the matrix $\mathbf {SS'}$ is non-negative definite, but I cannot determine the sign or size of the two negative terms. Any help would be appreciated. Thanks.
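A quick numerical sanity check of the last expression may help before attempting a proof. The sketch below is only an illustration under stated assumptions: the post never writes $\mathbf S$ explicitly, so the code assumes the standard partitioned-regression form $\mathbf S = \mathbf{X_2(X_2'M_1X_2)^{-1}X_2'M_1}$ with $\mathbf{M_1 = I - X_1(X_1'X_1)^{-1}X_1'}$, and the data sizes, seed, and function names (`build_S`, `bracket_terms`, `variance_gap`) are hypothetical. It evaluates the three terms of the bracket separately and compares their combination with the variance difference (divided by $\sigma^2$) computed directly from the two covariance matrices.

```python
import numpy as np


def build_S(X1, X2):
    """Return S with X2 @ beta2_hat == S @ y in the larger model.

    Assumption (not in the post): S = X2 (X2' M1 X2)^{-1} X2' M1,
    where M1 = I - X1 (X1' X1)^{-1} X1'.
    """
    n = X1.shape[0]
    M1 = np.eye(n) - X1 @ np.linalg.solve(X1.T @ X1, X1.T)
    return X2 @ np.linalg.solve(X2.T @ M1 @ X2, X2.T @ M1)


def bracket_terms(X1, X2, u):
    """Return (z'SS'z, z'S'z, z'Sz) for z = X1 (X1'X1)^{-1} u."""
    S = build_S(X1, X2)
    z = X1 @ np.linalg.solve(X1.T @ X1, u)
    return z @ S @ S.T @ z, z @ S.T @ z, z @ S @ z


def variance_gap(X1, X2, u):
    """Return [Var(u'b1^(2)) - Var(u'b1^(1))] / sigma^2, computed directly."""
    p1 = X1.shape[1]
    X = np.hstack([X1, X2])
    cov2 = np.linalg.inv(X.T @ X)[:p1, :p1]  # beta_1 block, larger model
    cov1 = np.linalg.inv(X1.T @ X1)          # smaller model
    return u @ (cov2 - cov1) @ u


rng = np.random.default_rng(0)  # arbitrary seed
n, p1, p2 = 40, 3, 2            # hypothetical problem sizes
X1 = rng.normal(size=(n, p1))
X2 = rng.normal(size=(n, p2))
y = rng.normal(size=n)

# Check that S y reproduces X2 beta2_hat from the full least-squares fit.
beta_full = np.linalg.lstsq(np.hstack([X1, X2]), y, rcond=None)[0]
fit_matches = np.allclose(build_S(X1, X2) @ y, X2 @ beta_full[p1:])

u = rng.normal(size=p1)
t_quad, t_cross1, t_cross2 = bracket_terms(X1, X2, u)
bracket = t_quad - t_cross1 - t_cross2
gap = variance_gap(X1, X2, u)
print(fit_matches, t_quad, t_cross1, t_cross2, bracket, gap)
```

With this choice of $\mathbf S$, the two cross terms should come out at rounding-error level, which is consistent with $\mathbf{M_1X_1 = 0}$ and hence $\mathbf{Sz = 0}$. That is an observation about this sketch rather than a proof, but it suggests where to look.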