Why is regularization used in linear regression?


I already understand that the point of regularization is to penalize (drive down) higher-order parameters in a model, thereby increasing its generality. Outside of polynomial regression, I do not understand why regularization would be needed for linear models, e.g., the Tikhonov regularization term in the analytical (closed-form) solution to linear regression:

$$\beta = (X^TX+\lambda I)^{-1}X^Ty$$

where $X$ is the design matrix, $I$ is the identity matrix whose dimension equals the number of columns of $X$, and $\lambda \in \mathbb{R}$, $\lambda \geq 0$.
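
In code, that closed form is just a linear solve (a minimal NumPy sketch; the function name and toy data are mine, not from any library):

```python
import numpy as np

def ridge_closed_form(X, y, lam):
    """Solve (X^T X + lambda I) beta = X^T y for beta."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# toy data: 50 observations, 3 explanatory variables
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)

print(ridge_closed_form(X, y, lam=0.0))  # ordinary least squares
print(ridge_closed_form(X, y, lam=1.0))  # coefficients shrunk toward zero
```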

From an intuitive standpoint, I do not understand why regularization is needed when the model's capacity is already fixed by constraining the order of the hypothesis (beyond ensuring invertibility). Thanks.

Best Answer

Tikhonov regularization in the form above is often motivated purely by invertibility, but penalties like the LASSO, ridge, and elastic net are for when you want to select explanatory variables but are worried about over-fitting.
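
To illustrate the distinction: a LASSO fit drives the coefficients of uninformative variables exactly to zero, which the Tikhonov/ridge form above never does. A small sketch using scikit-learn (the data, seed, and penalty value are arbitrary assumptions of mine):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
# only the first two of the five variables actually matter
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=100)

lasso = Lasso(alpha=0.1).fit(X, y)
ridge = Ridge(alpha=0.1).fit(X, y)
print(lasso.coef_)  # irrelevant coefficients typically driven exactly to zero
print(ridge.coef_)  # all coefficients shrunk, but none exactly zero
```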

If you are familiar with $R^2$, you know that adding another explanatory variable never decreases the $R^2$ of the model. This leads to models that do very well in-sample but give very poor out-of-sample predictions. The LASSO, least-angle regression, random forests, etc. all aim to minimize expected out-of-sample (mean-squared) error, which means you want to throw away explanatory variables that contribute mostly over-fitting.
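
That claim is easy to check numerically: fit OLS with progressively more pure-noise regressors and watch in-sample $R^2$ climb while out-of-sample $R^2$ deteriorates. A rough sketch (the split, seed, and dimensions are arbitrary choices of mine):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
n = 60
signal = rng.normal(size=(n, 1))                # one real explanatory variable
y = signal[:, 0] + rng.normal(scale=1.0, size=n)
noise = rng.normal(size=(n, 40))                # forty irrelevant regressors

train, test = slice(0, 30), slice(30, n)
for k in (0, 10, 20, 40):
    X = np.hstack([signal, noise[:, :k]])       # add k noise columns
    model = LinearRegression().fit(X[train], y[train])
    print(k, model.score(X[train], y[train]), model.score(X[test], y[test]))
# in-sample R^2 never decreases with k; out-of-sample R^2 falls apart
```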

But this gets you back to regularization. Basic regularization is, you have more explanatory variables than observations. The over-fitting problem is, you have enough data to fit a linear model (i.e., solve $(X'X)\beta = X'y$) but you think the resulting model will be too sensitive. Similar tools can help you make good decisions about what explanatory variables are the most useful.