The interpretation of linear regression coefficients that I learned is that the coefficient is the change in outcome associated with a unit change in that covariate, assuming all other covariates stay the same. But if the other covariates can't stay the same, can I somehow control for that?
In this example, where the covariates we have available are functions of the population model constituents (X1 and X2), an increase in covar2 is associated with an increase in the outcome. But covar2 will have a negative coefficient because of its correlation with covar1.
set.seed(1)  # for reproducibility
n <- 10000
X1 <- runif(n, 0, 1)
X2 <- rnorm(n, 1, 0.5)
error <- rnorm(n, 0, 0.1)
covar1 <- 0.8 * X1 + 0.4 * X2
covar2 <- 0.75 * X1
Y <- X1 + X2 + error
summary(lm(Y ~ covar1 + covar2))
The estimated coefficients are approximately
covar1  2.50
covar2 -1.33
But I don't know how to derive meaning from them because the covariates can't vary independently. Can I say something like "controlling for covar1, a unit increase in covar2 is associated with an increase of <some number> in the outcome"? If so, how would I derive <some number>?
When you make inferences in regression analysis about the effect of an independent variable on the dependent variable, you usually presume ceteris paribus, i.e., that every other covariate remains unchanged (an analogue of the partial derivative in calculus): https://en.wikipedia.org/wiki/Ceteris_paribus#Economics. And "<some number>" is simply the coefficient itself, e.g., $2.5$ for the first covariate.
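To see where those particular numbers come from, you can invert the linear map relating (covar1, covar2) to (X1, X2) and substitute into the population model. A short R sketch (restating the simulation from the question, with a seed added for reproducibility):

```r
# Invert the system:
#   covar2 = 0.75*X1            =>  X1 = (4/3)*covar2
#   covar1 = 0.8*X1 + 0.4*X2    =>  X2 = 2.5*covar1 - 2*X1
#                                       = 2.5*covar1 - (8/3)*covar2
# Substituting into Y = X1 + X2 + error gives
#   Y = 2.5*covar1 - (4/3)*covar2 + error,
# so the population coefficients are 2.5 and -4/3 (~ -1.33).

set.seed(1)
n <- 10000
X1 <- runif(n, 0, 1)
X2 <- rnorm(n, 1, 0.5)
error <- rnorm(n, 0, 0.1)
covar1 <- 0.8 * X1 + 0.4 * X2
covar2 <- 0.75 * X1
Y <- X1 + X2 + error

# The fitted coefficients recover the algebraic values above
fit <- lm(Y ~ covar1 + covar2)
round(coef(fit)[c("covar1", "covar2")], 2)
```

So the negative sign on covar2 is not a statistical artifact: once covar1 is held fixed, raising covar2 forces X1 up and X2 down, and the net effect on Y is exactly $-4/3$ per unit of covar2.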