How to create a model with high multicollinearity

I am going to build a model strictly for predictive purposes. Some of my independent variables are highly correlated. When I fit the regression with all of the variables together, should I not trust the p-values? Is it a better idea to check each independent variable separately against the dependent variable, just to make sure that they are significantly related?

There is 1 best solution below

You should not use very highly correlated variables in your model, as the estimated coefficients of the regression will be meaningless. Imagine regressing $Y$ on $X_1,\dots,X_n$ with $X_1=\dots=X_n$: you obviously lose the uniqueness of the coefficients, since only their sum is determined by the data.
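To make the extreme case concrete, here is a minimal sketch with two identical predictors. The data are simulated and the names (`beta_a`, `beta_b`, the true slope of 2) are purely illustrative assumptions, not anything from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(scale=0.1, size=n)  # simulated response (assumption)

# Two identical predictors: X_1 = X_2
X = np.column_stack([x, x])

# The design matrix has rank 1, not 2
rank = np.linalg.matrix_rank(X)

# Two very different coefficient vectors with the same sum
# produce exactly the same fitted values
beta_a = np.array([2.0, 0.0])
beta_b = np.array([-5.0, 7.0])
same_fit = np.allclose(X @ beta_a, X @ beta_b)

# lstsq returns the minimum-norm solution; only the sum of the
# two coefficients is pinned down by the data
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

print("rank:", rank)
print("same fitted values:", same_fit)
print("beta_hat:", beta_hat, "sum:", beta_hat.sum())
```

With predictors that are highly but not perfectly correlated, the coefficients are technically unique, but small changes in the data can move them a lot, which is why the individual p-values become hard to trust.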

Try performing a principal component analysis and dropping the variables that have no explanatory power in sample.
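A rough sketch of that idea, using only NumPy. Everything here is an assumption for illustration: the data are simulated, and "no explanatory power" is read as "the component explains less than 1% of the predictor variance", which is just one possible criterion (you could instead judge components by how well they explain $Y$).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
z = rng.normal(size=n)

# Three nearly identical predictors plus one independent predictor (simulated)
X = np.column_stack([
    z + 0.01 * rng.normal(size=n),
    z + 0.01 * rng.normal(size=n),
    z + 0.01 * rng.normal(size=n),
    rng.normal(size=n),
])
y = 1.5 * z + 0.5 * X[:, 3] + rng.normal(scale=0.1, size=n)

# Standardize the predictors and center the response
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
yc = y - y.mean()

# PCA via the singular value decomposition of the standardized data
U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
explained = s**2 / np.sum(s**2)  # share of variance per component

# Keep components above an arbitrary 1% threshold (assumption)
keep = explained > 0.01
scores = Xs @ Vt[keep].T

# Regress the centered response on the retained component scores
gamma, *_ = np.linalg.lstsq(scores, yc, rcond=None)
y_pred = scores @ gamma + y.mean()
r2 = 1 - np.sum((y - y_pred) ** 2) / np.sum((y - y.mean()) ** 2)

print("explained variance shares:", np.round(explained, 4))
print("components kept:", int(keep.sum()))
print("in-sample R^2:", round(r2, 4))
```

In this simulated setup the three near-duplicate predictors should collapse into essentially one component, so the regression runs on uncorrelated scores rather than on the collinear originals. For a purely predictive model you would still want to check the fit out of sample, not only in sample.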