I am working on a predictive model for a real estate marketing business:
y~f(X)
The problem is that my X has significant collinearity, because many of the raw features are 'naturally' correlated. For example, the average household income in a zip code is correlated with the education level in that zip code: people who went to college tend to make more money.
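To make the issue concrete, here is a minimal sketch on synthetic data. The feature names (`income`, `education`, `age`), the coefficients, and the helper `vif` are made up for illustration and are not my actual data or pipeline. It computes the pairwise correlation matrix and a variance inflation factor (VIF) per feature, which is one common way to quantify collinearity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Synthetic zip-code features (assumed for illustration only).
education = rng.normal(size=n)
income = 0.9 * education + 0.3 * rng.normal(size=n)  # driven mostly by education
age = rng.normal(size=n)                              # independent of the other two
X = np.column_stack([income, education, age])

# Pairwise Pearson correlations between the columns of X.
corr = np.corrcoef(X, rowvar=False)

def vif(X):
    """VIF per column: 1 / (1 - R^2), regressing each column on the others."""
    Xc = X - X.mean(axis=0)  # centering removes the need for an intercept
    out = []
    for j in range(Xc.shape[1]):
        y = Xc[:, j]
        others = np.delete(Xc, j, axis=1)
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1.0 - (resid @ resid) / (y @ y)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

vifs = vif(X)
print(np.round(corr, 2))
print(np.round(vifs, 2))
```

In this toy setup the two correlated columns get a much larger VIF than the independent one, which is roughly the pattern I see in my real features.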
I am thinking about removing the highly collinear features. However, I am afraid that by doing so I would throw away a lot of valuable information.
Ideally, is there any prediction method (other than OLS) that works well with collinear features? Is a decision tree a good choice here?
Any insights here?
Thanks