Let's consider the following problem: we want to predict a variable $y$ and we have two categorical variables, $A$, which can take 3 different values, and $B$, which can take 2 different values.
A regression model with interaction would be :
$$y = \sum_{k=1}^{3}\alpha_k\mathbb{1}_{A_k}+\sum_{k=1}^{2}\beta_k\mathbb{1}_{B_k}+\sum_{i,j}\gamma_{ij}\mathbb{1}_{A_iB_j}$$
Another equivalent formulation :
$$y = \mu +\sum_{k=1}^{3}\hat{\alpha}_k\mathbb{1}_{A_k}+\sum_{k=1}^{2}\hat{\beta}_k\mathbb{1}_{B_k}+\sum_{i,j}\hat{\gamma}_{ij}\mathbb{1}_{A_iB_j}$$
In the second formulation, does $\mu$ have a "meaning"? Is it the mean of $y$ across all categories? Is there any advantage to using one formula over the other in this setting?
Also, if we consider R's output, which drops the first category of each variable, how should we interpret its intercept coefficient?
2026-03-25 20:41:14.1774471274
Linear regression model with 2 categorical variables
208 Views Asked by Bumbble Comm https://math.techqa.club/user/bumbble-comm/detail At
The issue with your formulations is that different sets of coefficients give the same predictions, so the coefficients are not uniquely determined. For example, in the first expression you could add a constant $c$ to all the $\alpha_k$ and subtract the same $c$ from all the $\beta_k$ and get the same $y$; in the second you could add $c$ to $\mu$ and subtract the same $c$ from all the $\hat{\beta}_k$.
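To see why such a shift changes nothing, note that every observation falls in exactly one level of $A$ and exactly one level of $B$, so $\sum_{k}\mathbb{1}_{A_k}=\sum_{k}\mathbb{1}_{B_k}=1$. A short check for the first formulation, with $c$ an arbitrary constant (the interaction term is left untouched):

$$\sum_{k=1}^{3}(\alpha_k+c)\mathbb{1}_{A_k}+\sum_{k=1}^{2}(\beta_k-c)\mathbb{1}_{B_k}=\sum_{k=1}^{3}\alpha_k\mathbb{1}_{A_k}+\sum_{k=1}^{2}\beta_k\mathbb{1}_{B_k}+c\left(\sum_{k=1}^{3}\mathbb{1}_{A_k}-\sum_{k=1}^{2}\mathbb{1}_{B_k}\right)=\sum_{k=1}^{3}\alpha_k\mathbb{1}_{A_k}+\sum_{k=1}^{2}\beta_k\mathbb{1}_{B_k}$$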
To give you an answer, R uses this freedom to set $\hat{\alpha}_1$, $\hat{\beta}_1$, all the $\hat{\gamma}_{1j}$ and all the $\hat{\gamma}_{i1}$ to zero, which allows a unique solution.
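Written out, and assuming these are the only constraints imposed (first levels as the reference, as described above), the second formulation reduces to:

$$y=\mu+\hat{\alpha}_2\mathbb{1}_{A_2}+\hat{\alpha}_3\mathbb{1}_{A_3}+\hat{\beta}_2\mathbb{1}_{B_2}+\hat{\gamma}_{22}\mathbb{1}_{A_2B_2}+\hat{\gamma}_{32}\mathbb{1}_{A_3B_2}$$

That leaves six free parameters for the $3\times 2=6$ combinations of levels. For an observation with $A=A_1$ and $B=B_1$ every indicator vanishes and the prediction is simply $\mu$.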
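In this case, the intercept $\mu$ is the predicted value (you have no numerical independent variables) when both factors are at their first levels. As an illustration, consider a toy example in which $y$ is regressed on two factors $a$ and $b$ and their interaction, and the fit gives an intercept of $2$, an `aG` coefficient of $1$, a `bT` coefficient of $3$ and an `aG:bT` coefficient of $1$.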
The intercept of $2$ corresponds to the prediction when $a$ is "F" and $b$ is "S". It is in fact the average of the first two $y$ values in the dataframe, $1$ and $3$, as you might intuitively expect. Then for example:
- adding the `bT` value of $3$ to the intercept gives $5$, the third value in the dataframe, which is the prediction when $a$ is "F" and $b$ is "T";
- adding the `aG` value of $1$ to the intercept gives $3$, the fourth value in the dataframe, which is the prediction when $a$ is "G" and $b$ is "S";
- adding the `aG`, `bT` and `aG:bT` values to the intercept gives $7$, the average of the fifth and sixth values in the dataframe, which is the prediction when $a$ is "G" and $b$ is "T".
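The arithmetic of this toy example can be checked with a small sketch. It is written in Python rather than R and uses only the standard library. The data are an assumption: the answer states the first four $y$ values and that the last two average $7$, so `6` and `8` are hypothetical stand-ins with that average. Because the model has one parameter per combination of levels, its least-squares predictions equal the cell means, and the reference-level coefficients can be read off as differences of those means. No fitting routine is called.

```python
from collections import defaultdict
from statistics import mean

# Assumed toy data (a, b, y). The first four y values and the average (7) of
# the last two come from the text; 6 and 8 are hypothetical values.
rows = [
    ("F", "S", 1),
    ("F", "S", 3),
    ("F", "T", 5),
    ("G", "S", 3),
    ("G", "T", 6),  # assumed
    ("G", "T", 8),  # assumed
]

# Average y within each (a, b) cell.
cells = defaultdict(list)
for a, b, y in rows:
    cells[(a, b)].append(y)
cell_mean = {cell: mean(ys) for cell, ys in cells.items()}

# With "F" and "S" as reference levels, the coefficients of the saturated
# model are differences of cell means.
intercept = cell_mean[("F", "S")]                     # prediction at a="F", b="S"
aG = cell_mean[("G", "S")] - intercept                # effect of a="G" when b="S"
bT = cell_mean[("F", "T")] - intercept                # effect of b="T" when a="F"
aG_bT = cell_mean[("G", "T")] - intercept - aG - bT   # interaction term

coefficients = {"(Intercept)": intercept, "aG": aG, "bT": bT, "aG:bT": aG_bT}
print(coefficients)
```

Under these assumed data the intercept is the mean of the "F"/"S" cell only, which is the point of the example: it is tied to the reference levels, not to all observations.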