In my textbook, the author says that whether multicollinearity is a problem depends on the purpose of the regression model: if we are using regression for prediction, multicollinearity isn't a problem, because it doesn't impair predictive power. I am confused here. If multicollinearity increases our standard errors, it will also make our prediction intervals wider. So shouldn't our predictive power be smaller, since the prediction interval is wider than before? Or does "predictive power" have nothing to do with the prediction interval?
Why doesn't multicollinearity affect the predictive power of a regression model?
Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail)
Multicollinearity means that some variables in your data can be expressed as a (linear) combination of other variables. This means that these "dependent" variables really don't add any new information that would increase the predictive power.
For example, if you measure temperature, humidity, and a third variable that is the average of temperature and humidity, then the predictive power of your variables will not be diminished if you drop this third variable before applying your favorite prediction tool. In this sense the variable doesn't contribute to increasing the predictive power; by the same token, it doesn't decrease it either.
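A quick numerical sketch of this point (the variable names and data below are invented for illustration, not from the answer): fitting by least squares with and without a perfectly redundant "average" column gives the same fitted values, even though the coefficients themselves are no longer unique.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
temp = rng.normal(20, 5, n)
humid = rng.normal(50, 10, n)
avg = (temp + humid) / 2            # perfectly collinear with the other two
y = 1.0 + 0.3 * temp - 0.1 * humid + rng.normal(0, 1, n)

X_full = np.column_stack([np.ones(n), temp, humid, avg])   # redundant column in
X_drop = np.column_stack([np.ones(n), temp, humid])        # redundant column out

# lstsq solves via the pseudoinverse, so it tolerates the singular X'X
beta_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)
beta_drop, *_ = np.linalg.lstsq(X_drop, y, rcond=None)

yhat_full = X_full @ beta_full
yhat_drop = X_drop @ beta_drop

# The fitted values agree to machine precision: the redundant column adds
# nothing to (and removes nothing from) the column space used for prediction
print(np.max(np.abs(yhat_full - yhat_drop)))
```

The redundant column changes which coefficient vector you get, but not the subspace the predictions are projected onto, which is all that matters for the fitted values.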
Sometimes, if the relationship between the variables is very complex (think hundreds of variables), keeping these dependent variables around may actually help you, as they may lead to a simpler and more robust model in the presence of noise.
You are right; however, when you assess the "predictive power" of a model you use the pointwise predictions, not the intervals (i.e., cross-validation, MSE, RMSE, etc., are all based on the predicted values $\hat{y}_i$).
Since multicollinearity does not violate any assumption and does not bias the predictions, it does not affect the average pointwise accuracy of the predictions, i.e., the so-called "predictive power".
It does affect the stability of the model (the standard errors you mentioned); however, if you are interested only in prediction, you don't care much about the model's sensitivity to potential new inputs, or about the "condition number" of the $X'X$ matrix.
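This split between unstable coefficients and stable predictions can be seen in a small simulation (a sketch with made-up data, not from the answer): two nearly collinear predictors give individual slope estimates that swing wildly from training sample to training sample, while the out-of-sample prediction error barely moves.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(n=100):
    """Draw a dataset with two nearly collinear predictors."""
    x1 = rng.normal(size=n)
    x2 = x1 + rng.normal(scale=0.01, size=n)   # almost a copy of x1
    y = 2.0 + x1 + x2 + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x1, x2])
    return X, y

X_test, y_test = simulate()                    # one fixed test set
betas, rmses = [], []
for _ in range(500):                           # refit on fresh training draws
    X, y = simulate()
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    betas.append(beta)
    rmses.append(np.sqrt(np.mean((y_test - X_test @ beta) ** 2)))

betas = np.array(betas)
# The slope on x1 alone is poorly identified and varies a lot ...
print("sd of slope estimates:", betas[:, 1].std())
# ... but the predictions (which depend only on the stable sum b1 + b2)
# give nearly the same test RMSE every time
print("sd of test RMSE:", np.std(rmses))
```

Only the combination of the two collinear coefficients is well identified, and the predictions depend only on that combination, which is why the RMSE stays put while the individual slopes do not.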