When can a matrix product be simplified by multiplying inverse matrices to get the identity matrix? Example: OLS estimator


In the derivation of the OLS estimator, having started with the two equations $E(u_t)=0$ and $E(u_tX_t)=0$ and having written these in matrix form, we obtain:

$$ \boldsymbol{X^T} \boldsymbol{X} \boldsymbol{\beta} = \boldsymbol{X^T} \boldsymbol{y}$$

Premultiplying both sides by $ (\boldsymbol{X^T} \boldsymbol{X})^{-1} $, assuming this inverse exists, we obtain:

$$ (\boldsymbol{X^T} \boldsymbol{X})^{-1}\boldsymbol{X^T} \boldsymbol{X} \boldsymbol{\beta} = (\boldsymbol{X^T} \boldsymbol{X})^{-1}\boldsymbol{X^T} \boldsymbol{y}$$

Expanding the inverse of the product on the left side, we get:

$$ \boldsymbol{X^{-1}} \boldsymbol{(X^T)^{-1}}\boldsymbol{X^T} \boldsymbol{X} \boldsymbol{\beta} = (\boldsymbol{X^T} \boldsymbol{X})^{-1}\boldsymbol{X^T} \boldsymbol{y}$$

Then, on the left side, we twice collapse the product of a matrix with its inverse into the identity matrix:

$$ \boldsymbol{X^{-1}} \boldsymbol{I} \boldsymbol{X} \boldsymbol{\beta} = (\boldsymbol{X^T} \boldsymbol{X})^{-1}\boldsymbol{X^T} \boldsymbol{y}$$

$$ \boldsymbol{I} \boldsymbol{\beta} = (\boldsymbol{X^T} \boldsymbol{X})^{-1}\boldsymbol{X^T} \boldsymbol{y}$$

$$ \boldsymbol{\beta} = (\boldsymbol{X^T} \boldsymbol{X})^{-1}\boldsymbol{X^T} \boldsymbol{y}$$

On the right-hand side, we do not apply the rule we applied to the left side (expanding the inverse of a product), although the right-hand side could be written as $ \boldsymbol{X^{-1}} (\boldsymbol{X^T})^{-1}\boldsymbol{X^T} \boldsymbol{y}$. The two terms in the middle appear to collapse into the identity matrix, leaving us with $ \boldsymbol{X^{-1}}\boldsymbol{y}$, which, however, does not appear to be conformable for multiplication. I am guessing this is the reason this last step is never done, but I am not sure, and I would like to understand precisely which rule of linear algebra I am breaking in taking this last step.

Best answer

The issue is the following one. If you suppose that $X^TX$ is invertible, then indeed

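$$(X^TX)^{-1}(X^TX) = I$$

by definition of the inverse of $X^TX$, but not because $(X^TX)^{-1} = X^{-1}(X^T)^{-1}$: that equality does not even make sense if $X$ is not a square matrix! The rule $(AB)^{-1} = B^{-1}A^{-1}$ applies only when $A$ and $B$ are each square and invertible. In a regression, $X$ is typically $n \times k$ with $n > k$, so neither $X^{-1}$ nor $(X^T)^{-1}$ exists, even though the $k \times k$ matrix $X^TX$ may well be invertible.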

This is why you have to keep $(X^TX)^{-1}$ intact on the right-hand side of the equality.
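A small numerical check may make this concrete. The sketch below uses NumPy and made-up data (the matrix `X`, the vector `beta_true`, and all variable names are assumptions for illustration, not part of the question). It shows that $(X^TX)^{-1}X^Ty$ is computable for a tall $X$ while $X^{-1}$ is not, and that when $X$ happens to be square and invertible the two expressions agree.

```python
import numpy as np

# Hypothetical data: n = 5 observations, k = 2 regressors (intercept + one covariate).
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
beta_true = np.array([2.0, 0.5])
y = X @ beta_true  # noiseless, so OLS should recover beta_true up to rounding

# X^T X is k x k (square), so its inverse can exist even though X is 5 x 2.
XtX = X.T @ X
beta_hat = np.linalg.inv(XtX) @ X.T @ y  # (X^T X)^{-1} X^T y

# X itself is not square, so X^{-1} is undefined; NumPy raises LinAlgError.
try:
    np.linalg.inv(X)
    x_has_inverse = True
except np.linalg.LinAlgError:
    x_has_inverse = False

# Special case: a square, invertible X (here the first two rows).
# Only then does (X^T X)^{-1} = X^{-1} (X^T)^{-1} hold, and beta = X^{-1} y.
X_sq, y_sq = X[:2], y[:2]
beta_from_inverse = np.linalg.inv(X_sq) @ y_sq
beta_from_formula = np.linalg.inv(X_sq.T @ X_sq) @ X_sq.T @ y_sq
```

In the tall case only `beta_hat` can be formed; in the square case `beta_from_inverse` and `beta_from_formula` coincide, which is the one situation where the simplification in the question is legitimate.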