OLS estimator through matrix operations


The OLS estimator is

$$\hat\beta = (X'X)^{-1}X'Y$$

I don't understand why we even have to think about minimizing the squared residuals and taking partial derivatives (e.g., https://web.stanford.edu/~mrosenfe/soc_meth_proj3/matrix_OLS_NYU_notes.pdf) when we could just do

$$ X\hat\beta = Y $$ $$ X'X\hat\beta = X'Y $$ $$ \hat\beta = (X'X)^{-1}X'Y$$

What am I missing?
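For context, the derivation in the linked notes minimizes the residual sum of squares and sets the gradient to zero:

$$ S(\beta) = (Y - X\beta)'(Y - X\beta), \qquad \frac{\partial S}{\partial \beta} = -2X'Y + 2X'X\beta = 0 \;\Rightarrow\; \hat\beta = (X'X)^{-1}X'Y. $$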


Your derivation is misleading. The correct starting point is either $X\beta = E[Y]$ or $X\hat\beta = \hat Y$; in general $X\hat\beta \neq Y$, because $Y$ need not lie in the column space of $X$. The orthogonal decomposition $Y = \hat Y + e$ stems from minimizing the squared residuals. Multiplying it by $X'$ gives $$ X'Y = X'(X\hat\beta + e) = X'X\hat\beta + X'e = X'X\hat\beta, $$ where the last step uses the fact that the residuals $e$ are orthogonal to the space spanned by the columns of $X$, i.e. $X'e = 0$. So you can indeed proceed to $\hat\beta = (X'X)^{-1}X'Y$, but only because the decomposition you started from is itself obtained by minimizing the residuals in the first place.
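A minimal NumPy sketch (the data and variable names here are made up for illustration) makes both points concrete: in the usual overdetermined case ($n > p$ with noise), no $\hat\beta$ satisfies $X\hat\beta = Y$ exactly, yet the residuals from the normal-equations solution are orthogonal to the columns of $X$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3  # more observations than regressors: X beta = Y is overdetermined
X = rng.normal(size=(n, p))
# response = linear signal plus noise (hypothetical coefficients)
Y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)

# OLS estimator via the normal equations: (X'X) beta_hat = X'Y
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

Y_hat = X @ beta_hat   # fitted values
e = Y - Y_hat          # residuals

# X beta_hat = Y does NOT hold exactly: Y is not in the column space of X
print(np.allclose(X @ beta_hat, Y))       # False
# ...but the residuals are orthogonal to every column of X: X'e = 0
print(np.allclose(X.T @ e, np.zeros(p)))  # True
```

So the step $X\hat\beta = Y$ in the question is the one that fails; only after projecting with $X'$ (equivalently, after the minimization) do the two sides agree.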