Frisch-Waugh Theorem: Partitioned Regression


I'd like to have a better understanding of the following lecture notes. I believe this is the Frisch-Waugh Theorem, or something very close to it.

Problem 5: Prove the partial regression formula by writing the normal equations in partitioned form and solving for $\hat\beta_2$ as a function of $\hat\beta_1$. Substitute this solution and solve for $\hat\beta_1$. Show that the partial regression formula still holds if we replace $y^*$ by $y$, i.e. if we do not 'purge' the dependent variable.

That last sentence is what I'm trying to understand.

The normal equations, $$ X^{\prime} X \hat{\beta}=X^{\prime} y, $$ put into partitioned form:

$$ \left[\begin{array}{cc}X_{1}^{\prime} X_{1} & X_{1}^{\prime} X_{2} \\ X_{2}^{\prime} X_{1} & X_{2}^{\prime} X_{2}\end{array}\right]\left[\begin{array}{c}\hat{\beta}_{1} \\ \hat{\beta}_{2}\end{array}\right]=\left[\begin{array}{l}X_{1}^{\prime} y \\ X_{2}^{\prime} y\end{array}\right] $$ or $$ \begin{array}{l}X_{1}^{\prime} X_{1} \hat{\beta}_{1}+X_{1}^{\prime} X_{2} \hat{\beta}_{2}=X_{1}^{\prime} y \\ X_{2}^{\prime} X_{1} \hat{\beta}_{1}+X_{2}^{\prime} X_{2} \hat{\beta}_{2}=X_{2}^{\prime} y\end{array} $$ From the second equation, $$ \hat{\beta}_{2}=\left(X_{2}^{\prime} X_{2}\right)^{-1} X_{2}^{\prime}\left(y-X_{1} \hat{\beta}_{1}\right) $$ Substituting into the first equation gives $$ X_{1}^{\prime} X_{1} \hat{\beta}_{1}+X_{1}^{\prime} X_{2}\left(X_{2}^{\prime} X_{2}\right)^{-1} X_{2}^{\prime}\left(y-X_{1} \hat{\beta}_{1}\right)=X_{1}^{\prime} y $$ Collecting terms we have $$ X_{1}^{\prime}\left(I-X_{2}\left(X_{2}^{\prime} X_{2}\right)^{-1} X_{2}^{\prime}\right) X_{1} \hat{\beta}_{1}=X_{1}^{\prime}\left(I-X_{2}\left(X_{2}^{\prime} X_{2}\right)^{-1} X_{2}^{\prime}\right) y $$ Define $M_{2} \equiv I-X_{2}\left(X_{2}^{\prime} X_{2}\right)^{-1} X_{2}^{\prime}$

Hence $$ \hat{\beta}_{1}=\left(X_{1}^{\prime} M_{2} X_{1}\right)^{-1} X_{1}^{\prime} M_{2} y $$ Further define $y^{*} \equiv M_{2} y=y-X_{2} \hat{\beta}_{2}^{*}$ with $\hat{\beta}_{2}^{*} \equiv\left(X_{2}^{\prime} X_{2}\right)^{-1} X_{2}^{\prime} y$

In the same way define $X_{1}^{*} \equiv M_{2} X_{1}$ so that $$ \hat{\beta}_{1}=\left(X_{1}^{*^{\prime}} X_{1}^{*}\right)^{-1} X_{1}^{*^{\prime}} y^{*} $$

I'm able to follow all of it, but I'm having trouble conceptually understanding the last four expressions involving $y^*$, $\hat\beta_2^*$, and $X_1^*$.

$y^*$ seems to just be the orthogonal errors/components of the first regression. And $\hat{\beta}_{2}^{*}$ looks like the standard formula for any $\hat\beta_2$. Why is it getting an asterisk? And lastly, $X_{1}^{*}$ has the $X_2$ components "purged". I don't understand the conceptual significance as a whole. The two separate derivations of $\hat\beta_1$ seem exactly the same?
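For concreteness, here is a small numerical check (a NumPy sketch with a made-up data-generating process; the dimensions and coefficients are arbitrary) that the full partitioned regression and the purged regression $\hat\beta_1=\left(X_1^{*\prime} X_1^*\right)^{-1} X_1^{*\prime} y^*$ return the same $\hat\beta_1$:

```python
import numpy as np

# Simulated data: n observations, X1 with k1 columns, X2 with k2 columns.
rng = np.random.default_rng(0)
n, k1, k2 = 200, 2, 3
X1 = rng.standard_normal((n, k1))
X2 = rng.standard_normal((n, k2))
y = X1 @ np.array([1.0, -2.0]) + X2 @ np.array([0.5, 0.0, 3.0]) + rng.standard_normal(n)

# Full OLS: beta_hat = (X'X)^{-1} X'y; the first k1 entries are beta1_hat.
X = np.hstack([X1, X2])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
beta1_full = beta_hat[:k1]

# Annihilator M2 = I - X2 (X2'X2)^{-1} X2'.
M2 = np.eye(n) - X2 @ np.linalg.solve(X2.T @ X2, X2.T)
X1_star = M2 @ X1   # X1 purged of X2
y_star = M2 @ y     # y purged of X2

# Partial regression: beta1_hat = (X1*' X1*)^{-1} X1*' y*.
beta1_partial = np.linalg.solve(X1_star.T @ X1_star, X1_star.T @ y_star)

print(np.allclose(beta1_full, beta1_partial))  # True
```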

Best Answer

You're right: the two derivations of $\hat{\beta}_1$ are the same. Since $M_2$ is symmetric and idempotent, \begin{align*} \hat{\beta}_{1}&=\left([M_{2}{\bf{X}}_{1}]^{\top}[M_{2} {\bf{X}}_{1}]\right)^{-1} [M_{2}{\bf{X}}_{1}]^{\top}{\bf{y}} \\ &=\left([M_{2}{\bf{X}}_{1}]^{\top}[M_{2} {\bf{X}}_{1}]\right)^{-1} [M_{2}{\bf{X}}_{1}]^{\top}[{M_2\bf{y}}], \end{align*} which means that to obtain the OLS estimator of $\beta_1$ in $$ y_i=X_{1i}^\top{\beta_1}+X_{2i}^\top{\beta_2}+\varepsilon_i,\tag{1}\label{1} $$ it suffices to regress both $y_i$ and $X_{1i}$ on $X_{2i}$ and then regress the resulting residuals on one another. Alternatively, you may regress only $X_{1i}$ on $X_{2i}$ and use those residuals together with the unpurged dependent variable to get $\hat{\beta}_1$. In addition, $\hat{\beta}_2^*$ is not an estimator of $\beta_2$ in $\eqref{1}$, i.e., $\hat{\beta}_2^*\ne \hat{\beta}_2$: it is the coefficient from the short regression of $y$ on $X_2$ alone, which omits $X_1$; hence the asterisk.
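Both claims can be checked numerically. The sketch below uses made-up data in which `X2` is deliberately correlated with `X1`, so the short-regression coefficient $\hat\beta_2^*$ visibly differs from $\hat\beta_2$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k1, k2 = 150, 2, 2
X1 = rng.standard_normal((n, k1))
X2 = rng.standard_normal((n, k2)) + 0.5 * X1[:, :1]   # correlate X2 with X1
y = X1 @ np.array([2.0, -1.0]) + X2 @ np.array([1.5, 0.5]) + rng.standard_normal(n)

M2 = np.eye(n) - X2 @ np.linalg.solve(X2.T @ X2, X2.T)
X1_star, y_star = M2 @ X1, M2 @ y

# Full regression gives (beta1_hat, beta2_hat).
X = np.hstack([X1, X2])
beta = np.linalg.solve(X.T @ X, X.T @ y)
beta1_hat, beta2_hat = beta[:k1], beta[k1:]

# (a) Residual-on-residual regression recovers beta1_hat ...
b1_both = np.linalg.solve(X1_star.T @ X1_star, X1_star.T @ y_star)
# (b) ... and so does purged X1 against the *unpurged* y, since M2 is
# symmetric and idempotent: X1*' y = X1' M2 y = X1*' y*.
b1_raw_y = np.linalg.solve(X1_star.T @ X1_star, X1_star.T @ y)

# But beta2* = (X2'X2)^{-1} X2' y is the short-regression coefficient,
# not the beta2_hat from the full regression (omitted-variable bias).
beta2_star = np.linalg.solve(X2.T @ X2, X2.T @ y)

print(np.allclose(b1_both, beta1_hat))     # True
print(np.allclose(b1_raw_y, beta1_hat))    # True
print(np.allclose(beta2_star, beta2_hat))  # False
```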