How to calculate the “common slope” in multiple linear regression models including categorical variables

I have the following data:

x=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]

y=[1.5,2.2,3.0,4.5,5.1,6.8,7.0,8.2,9.1,9.9,11.3,12.5,13.6,14.7,15.7]

group=['A','A','A','A','A','B','B','B','B','B','C','C','C','C','C']

I now fit a multiple linear regression model that includes a categorical variable, $$ Y=\beta _{0}+\beta _{1}X_{1}+\beta _{2}X_{2}+\beta _{3}X_{3} $$ where $X_{1}$ represents the variable x in the data set and $X_{2}$, $X_{3}$ are dummy variables for group.

The categorical variable group is coded as follows:

Group $X_{2}$ $X_{3}$
A 0 0
B 1 0
C 0 1
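
As a quick check of this coding, the sketch below asks R for the design matrix it would build from `~ x + group`. It assumes R's default treatment contrasts with A as the reference level; the column names `groupB` and `groupC` are the ones R generates and play the roles of $X_{2}$ and $X_{3}$.

```r
# Same x and group as in the data above (y is not needed for the design matrix)
df <- data.frame(x = 1:15,
                 group = rep(c("A", "B", "C"), each = 5))

# Design matrix with columns (Intercept), x, groupB, groupC
X <- model.matrix(~ x + group, data = df)

# One row of dummy codes per group, in the order A, B, C
dummy_codes <- unique(X[, c("groupB", "groupC")])
dummy_codes
```

Each distinct row of `dummy_codes` corresponds to one row of the table above.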

My question is: how do I calculate this "common slope" $\beta_{1}$? Is there a mathematical formula for it?

The relevant R code is as follows

df <- data.frame(x=c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15),
                 y=c(1.5,2.2,3.0,4.5,5.1,6.8,7.0,8.2,9.1,9.9,11.3,12.5,13.6,14.7,15.7),
                 group=c('A','A','A','A','A','B','B','B','B','B','C','C','C','C','C'))
model <- lm(y~x+group,data = df)
summary(model)

The resulting estimate of $\beta_{1}$ is 0.96.

There are 1 best solutions below

Yes, there is a formula; see chapter 17, part 2 of this website.
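
Written out, the formula as I read it from the calculation is the pooled within-group sum of cross-products divided by the pooled within-group sum of squares of x. The symbols $\hat{\beta}_{1}$, $\bar{x}_{g}$ and $\bar{y}_{g}$ (the means of x and y within group $g$) are my own notation; $SC_{wg}$ and $SS_{wg(X)}$ correspond to the R variables `SC_wg` and `SS_wg_X`:

$$ \hat{\beta}_{1}=\frac{SC_{wg}}{SS_{wg(X)}}=\frac{\sum_{g}\sum_{i\in g}(x_{i}-\bar{x}_{g})(y_{i}-\bar{y}_{g})}{\sum_{g}\sum_{i\in g}(x_{i}-\bar{x}_{g})^{2}} $$

For the data in the question, the groups A, B and C contribute cross-products of 9.5, 8.3 and 11.0 and sums of squares of 10 each, so

$$ \hat{\beta}_{1}=\frac{9.5+8.3+11.0}{10+10+10}=\frac{28.8}{30}=0.96 $$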

Here is the R code performing the calculations:

x <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15)
y <- c(1.5,2.2,3.0,4.5,5.1,6.8,7.0,8.2,9.1,9.9,11.3,12.5,13.6,14.7,15.7)
group <- c('A','A','A','A','A','B','B','B','B','B','C','C','C','C','C')

# For each group: the within-group sum of squares of x (SS_x_*) and the
# within-group sum of cross-products of x and y (SC_wg_*).
# The divisor 5 is the number of observations in each group.
indicesA <- which(group == "A")
SS_x_A <- sum(x[indicesA]^2) - sum(x[indicesA])^2/5
SC_wg_A <- sum(x[indicesA] * y[indicesA]) - sum(x[indicesA])*sum(y[indicesA])/5
indicesB <- which(group == "B")
SS_x_B <- sum(x[indicesB]^2) - sum(x[indicesB])^2/5
SC_wg_B <- sum(x[indicesB] * y[indicesB]) - sum(x[indicesB])*sum(y[indicesB])/5
indicesC <- which(group == "C")
SS_x_C <- sum(x[indicesC]^2) - sum(x[indicesC])^2/5
SC_wg_C <- sum(x[indicesC] * y[indicesC]) - sum(x[indicesC])*sum(y[indicesC])/5

# pool the within-group quantities over the three groups
SS_wg_X <- SS_x_A + SS_x_B + SS_x_C
SC_wg <- SC_wg_A + SC_wg_B + SC_wg_C

# estimate of the common slope:
SC_wg / SS_wg_X

# compare with the coefficient of x fitted by lm():
lm(y ~ group + x)
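
The same calculation can be sketched more compactly by centering x and y on their group means before taking the sums. This is an equivalent rearrangement of the steps above rather than a different method; the names `xc`, `yc`, `slope_manual` and `slope_lm` are only illustrative.

```r
x <- 1:15
y <- c(1.5, 2.2, 3.0, 4.5, 5.1, 6.8, 7.0, 8.2, 9.1, 9.9,
       11.3, 12.5, 13.6, 14.7, 15.7)
group <- rep(c("A", "B", "C"), each = 5)

# ave() returns, for every observation, the mean of its own group,
# so xc and yc are deviations from the group means
xc <- x - ave(x, group)
yc <- y - ave(y, group)

# pooled within-group cross-products over pooled within-group sum of squares
slope_manual <- sum(xc * yc) / sum(xc^2)

# slope of x from the regression with group dummies
slope_lm <- unname(coef(lm(y ~ group + x))["x"])

all.equal(slope_manual, slope_lm)
```

Because the groups are centered with `ave()`, this version does not hard-code the group size and would also work with unequal group sizes.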