Does adding polynomial terms in linear regression models not violate multicollinearity assumptions?


Linear regression comes with a couple of assumptions. One of them is that the features should be independent, or at least not (too strongly) correlated. If we add polynomial features to the feature vector, does that not create correlated features?

Also, could anyone share a link to a good data set for practicing regression splines / smoothing splines / piecewise polynomial regression?

There is 1 answer below.


Usually, you do not assume that the features are independent, or even pairwise uncorrelated. The assumption is only that they are not so strongly correlated that the model becomes unstable; what is strictly ruled out is perfect multicollinearity, i.e., one feature being an exact linear combination of the others.
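To make the distinction concrete, here is a small numpy sketch (illustrative, not from the original answer): an exactly duplicated column makes the normal-equations matrix $X^\top X$ singular (perfect multicollinearity), while adding $x^2$ leaves it full rank even though the columns are correlated.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=n)

# Design with an exact linear dependence: third column is 2 * x.
X_dup = np.column_stack([np.ones(n), x, 2 * x])
# Design with a polynomial term: correlated with x, but not a linear combination.
X_poly = np.column_stack([np.ones(n), x, x**2])

rank_dup = np.linalg.matrix_rank(X_dup.T @ X_dup)    # rank-deficient: 2 of 3
rank_poly = np.linalg.matrix_rank(X_poly.T @ X_poly)  # full rank: 3 of 3
print(rank_dup, rank_poly)
```

With the duplicated column, $X^\top X$ cannot be inverted and the OLS coefficients are not identifiable; with the polynomial term, the estimator exists but its variance depends on how correlated the columns are.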

Adding powers of some features may or may not cause problems. $x$ and $x^2$ are not linearly dependent, so there is no violation and no danger of perfect multicollinearity. However, linear independence does not imply zero correlation: $\mathrm{Cov}(X, X^2)$ may well be nonzero. Concretely, if $x^2$ does not help to "explain" the conditional variability of $Y$, including it may add noise (variance) to your model and harm the stability of the regression coefficients.
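A quick way to see that $\mathrm{Cov}(X, X^2)$ is typically nonzero, and one standard mitigation, is this sketch (my own illustration, with an arbitrary uniform distribution on $[1, 3]$): for a feature with positive support, $x$ and $x^2$ are almost perfectly correlated, while centering $x$ before squaring drives the correlation toward zero.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(1, 3, size=1000)  # positive support -> x and x^2 move together

corr_raw = np.corrcoef(x, x**2)[0, 1]  # close to 1

xc = x - x.mean()                       # center first ...
corr_centered = np.corrcoef(xc, xc**2)[0, 1]  # ... correlation near 0

print(f"raw: {corr_raw:.3f}, centered: {corr_centered:.3f}")
```

Centering does not change what the quadratic model can fit; it only re-parameterizes it so that the two columns of the design matrix are closer to orthogonal, which stabilizes the coefficient estimates.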