Effects of the order of averaging and regressing


Suppose we have data on individuals from different cities (dependent variable: income; independent variables: age, gender, treatment (Boolean), etc.), and we want to estimate the effect of treatment at the city level. My question is: if I want to fit a linear model, should I

A. regress income on all the other factors at the subject level, then average over the subjects in each city, or

B. average over the subjects in each city first, then fit income ~ age + gender + other factors + treatment on the city-level averages?
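To make the two options concrete, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the coefficients, the number of cities, the helper names `ols` and `city_mean`, and in particular the reading of option A as "fit on subjects, then average the fitted values per city". The simulated data has no unobserved city-level confounder, so it does not cover cases where the two options could diverge more strongly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 30 cities with 200 subjects each (equal group sizes).
n_cities, n_per_city = 30, 200
city = np.repeat(np.arange(n_cities), n_per_city)
n = city.size

# Simulated regressors; the age distribution and treatment share vary by city.
age = rng.normal(40, 10, n) + 0.5 * city
gender = rng.integers(0, 2, n).astype(float)
p_treat = rng.uniform(0.2, 0.8, n_cities)
treatment = (rng.random(n) < p_treat[city]).astype(float)

# Assumed data-generating model: the true treatment effect is 5.0.
income = 20.0 + 0.3 * age + 2.0 * gender + 5.0 * treatment + rng.normal(0, 3, n)


def ols(X, y):
    """Least-squares fit with an intercept; returns [intercept, slopes...]."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


def city_mean(v):
    """Average a subject-level vector within each city."""
    return np.bincount(city, weights=v, minlength=n_cities) / np.bincount(
        city, minlength=n_cities
    )


X = np.column_stack([age, gender, treatment])

# Option A (one reading): fit on subjects, then average fitted values per city.
coef_a = ols(X, income)
fitted_a = np.column_stack([np.ones(n), X]) @ coef_a
city_fitted_a = city_mean(fitted_a)

# Option B: average every variable per city, then fit on the 30 city rows.
X_bar = np.column_stack([city_mean(age), city_mean(gender), city_mean(treatment)])
coef_b = ols(X_bar, city_mean(income))

print("treatment coefficient, subject-level fit (A):", coef_a[3])
print("treatment coefficient, city-mean fit (B):    ", coef_b[3])
```

In this particular simulation both fits target the same treatment coefficient, but option B estimates it from only 30 rows, so its estimate should be noticeably noisier from seed to seed.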

What difference does it make if someone previously split the regression, first fitting income ~ other factors, then averaging over the subjects in each city to get a new income' with the "other factors" regressed out, and only then fitting income' ~ age + gender + treatment? And what if the regressors are collinear?
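The collinearity part of this can be illustrated in isolation. The sketch below is simplified and rests on assumptions: it works at the subject level only (no city averaging), uses a single regressor `x` that is correlated with a single "other factor" `other`, and the coefficients and helper names `fit` and `residual` are made up. It compares the full regression with (i) regressing out `other` from the response only, and (ii) regressing it out of both the response and `x`, which is the partialling-out construction of the Frisch-Waugh-Lovell theorem.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Hypothetical regressors: x is deliberately correlated with `other`.
other = rng.normal(size=n)
x = 0.8 * other + 0.6 * rng.normal(size=n)
y = 1.0 + 2.0 * x + 3.0 * other + rng.normal(size=n)


def fit(X, y):
    """Least-squares fit with an intercept; returns [intercept, slopes...]."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


def residual(X, y):
    """Residual of y after regressing it on X (with an intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    return y - X1 @ fit(X, y)


# Full regression: y ~ x + other, fitted in one step.
full = fit(np.column_stack([x, other]), y)[1]

# Split regression, response only: regress `other` out of y, then fit on raw x.
y_res = residual(other, y)
naive = fit(x, y_res)[1]

# Partialling out on both sides: also regress `other` out of x.
x_res = residual(other, x)
fwl = fit(x_res, y_res)[1]

print("full regression coefficient on x:", full)
print("response-only split coefficient: ", naive)
print("both-sides split coefficient:    ", fwl)
```

Under these assumptions the both-sides version reproduces the full-regression coefficient up to floating-point error, while the response-only version is pulled toward zero because part of the variation in `x` was already attributed to `other`. Whether the same holds once city averaging sits between the two stages is not something this sketch shows.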

(There was an approximately equal number of subjects from each city.)