Which method would you use to ascertain the most important independent variables?


I am trying to ascertain which independent variables matter the most as they pertain to the dependent variable.

I have tried two methods, a correlation matrix and scaled coefficients in a regression, and they give slightly different answers.

I need clarification on two points:

  1. The correlation matrix and regression coefficients are giving me slightly different X-variables that matter the most. Which method would you use?

For example:

The correlation matrix shows that crime has a strong negative correlation with income (Y), while public transit, education, and population each have a strong positive correlation with income (Y).

The scaled coefficients from the regression show that access to public transit, education, and access to tutors each have a strong positive relationship with income (Y).

Strongest variables from the correlation matrix, in R with corrr::correlate(data):

  • crime
  • transit
  • education
  • population

Strongest variables from the scaled regression coefficients, in R with lm(scale(y) ~ scale(x1) + scale(x2) + scale(x3) + ...):

  • transit
  • education
  • tutors
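
A minimal, reproducible sketch of the two approaches on simulated data may help frame the question. The variable names mirror the ones above, but the numbers and effect sizes are invented for illustration and do not come from the real dataset. Base R `cor()` stands in for `corrr::correlate()` so the block needs no extra packages. Here `population` is built to be correlated with `transit` but to have no direct effect on `income`, which is one way the two rankings can disagree.

```r
# Simulated data: every effect size below is an assumption made for illustration.
set.seed(42)
n <- 500
education  <- rnorm(n)
transit    <- rnorm(n)
# population tracks transit but has no direct effect on income in this setup
population <- 0.8 * transit + rnorm(n, sd = 0.6)
income     <- 0.5 * education + 0.5 * transit + rnorm(n)

dat <- data.frame(income, education, transit, population)

# Method 1: pairwise (marginal) correlation of each predictor with income
marginal <- cor(dat)["income", c("education", "transit", "population")]

# Method 2: standardized regression coefficients, each adjusted for the others
fit <- lm(scale(income) ~ scale(education) + scale(transit) + scale(population),
          data = dat)
partial <- coef(fit)[-1]  # drop the intercept

print(round(marginal, 2))
print(round(partial, 2))
```

In this setup `population` should show a clearly positive correlation with `income` but a standardized coefficient near zero, because the regression attributes the shared signal to `transit`.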

Which should I use, and why? I lean toward the regression because it specifies an actual relationship. Or would you do something else?

  2. Also, I thought that collinearity/high correlation among the predictors was a problem in regressions, which makes these two methods seem at odds with each other.
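
One common way to check whether collinearity is actually a concern is the variance inflation factor (VIF). The sketch below computes it by hand in base R, assuming the predictors sit in a data frame. The helper name `vif_manual` and the simulated columns `x1`, `x2`, `x3` are hypothetical and only for illustration.

```r
# VIF for each predictor: 1 / (1 - R^2), where R^2 comes from regressing
# that predictor on all of the other predictors.
vif_manual <- function(predictors) {
  sapply(names(predictors), function(v) {
    others <- setdiff(names(predictors), v)
    r2 <- summary(lm(reformulate(others, response = v),
                     data = predictors))$r.squared
    1 / (1 - r2)
  })
}

# Simulated predictors (assumed values, not real data)
set.seed(1)
x1 <- rnorm(200)
x2 <- x1 + rnorm(200, sd = 0.3)  # nearly a copy of x1
x3 <- rnorm(200)                 # unrelated to the others

vifs <- vif_manual(data.frame(x1, x2, x3))
print(round(vifs, 1))
```

Values close to 1 suggest a predictor is nearly independent of the others. Large values (rules of thumb often cite 5 or 10) suggest its coefficient is estimated imprecisely because of overlap with other predictors.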

Thank you in advance for any clarification/guidance.