Including month and year dummies in an OLS regression

Let’s say we would like to predict the sales of a company (Y) using the size of the company and two month dummies. We observe only one company, over a period of 20 years.

${\rm sales}_t = \alpha + \beta_1 {\rm size}_t + \beta_2 {\rm January}_t + \beta_3 {\rm October}_t + \varepsilon_t$

1st question: Is it allowed to add a 2001 dummy when January and October dummies are already included in the regression equation?
If it is allowed, and assuming the coefficient is significant, how should the 2001 dummy be interpreted:

  1. For observations in the year 2001, the predicted sales are $\beta_4$ units higher/lower than in the other years, everything else constant.
  2. For observations in the year 2001 and not in January or October, the predicted sales are $\beta_4$ units higher/lower than in the other years, everything else constant.

2nd question: Suppose it is allowed to include the year dummy. Let’s say we want to know how the effect of size on sales changes in the year 2001. To test this, an interaction term is created:

${\rm sales}_t = \alpha + \beta_1 {\rm size}_t + \beta_2 {\rm January}_t + \beta_3 {\rm October}_t + \beta_4 {2001}_t + \beta_5 ({2001}_t \times {\rm size}_t) + \varepsilon_t$

Which of the following interpretations of the interaction term would be correct (if $\beta_1$ and $\beta_5$ are significant):

  1. For every additional unit of size, the predicted sales increase/decrease by $\beta_1$ units, but in 2001 they increase/decrease by $\beta_1 + \beta_5$ units, everything else constant.
  2. For every additional unit of size, the predicted sales increase/decrease by $\beta_1$ units, but in 2001 and not in January or October, they increase/decrease by $\beta_1 + \beta_5$ units, everything else constant.

I am unsure whether it is allowed to add the year dummy and, if it is, whether I need the "and not in January or October" qualifier when interpreting it. I couldn't find answers to this in books or on this forum.

Please let me know if you have suggestions on how I can improve my question.

Thank you!

Best answer

1st question: Another graph indicates that both size and sales vary strongly in 2008 (financial crisis). Is it allowed to add a 2008 dummy when January and October dummies are already included in the regression equation?

Yes, no problem with that, since an observation that belongs to 2008 may or may not fall in January or October. Adding dummies is not "allowed" only when it creates perfect multicollinearity, e.g., adding a dummy variable for each of the $12$ months while also keeping the intercept.
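To make the multicollinearity point concrete, here is a minimal sketch that compares the column rank of two dummy designs. The 2000-2019 monthly sample, the helper names `matrix_rank` and `design`, and the omission of the size regressor are all assumptions made for illustration, not part of the original question.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank via exact Gaussian elimination on a copy of `rows`."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below `rank` with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            if m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# Assumed sample: one observation per month over 20 years (2000-2019).
dates = [(year, month) for year in range(2000, 2020) for month in range(1, 13)]

def design(dates, months, years):
    """Intercept column plus one dummy per listed month and per listed year."""
    return [
        [1]
        + [int(m == month) for month in months]
        + [int(y == year) for year in years]
        for (y, m) in dates
    ]

ok = design(dates, months=[1, 10], years=[2008])          # intercept, Jan, Oct, 2008
trap = design(dates, months=list(range(1, 13)), years=[])  # intercept + 12 months

rank_ok = matrix_rank(ok)
rank_trap = matrix_rank(trap)
print(rank_ok, len(ok[0]))      # 4 4   -> full column rank
print(rank_trap, len(trap[0]))  # 12 13 -> one column is redundant
```

In the second design the twelve month dummies sum to the intercept column in every row, which is why the rank falls one short of the number of columns. The year dummy causes no such problem because it is not a linear combination of the intercept and the two month dummies.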

For the second question, you don't need the "not in January or October" part. Holding everything else fixed covers both the case where the observation is from, e.g., January (i.e., January $= 1$) and the case where it is not (i.e., January $= 0$). Since size is interacted only with the year dummy and not with the month dummies, the marginal effect of size is the same in both cases.
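A small numerical check of this, as a sketch only: the coefficient values below are hypothetical and chosen purely for illustration, and `predicted_sales` and `size_effect` are made-up helper names. The code computes the change in predicted sales for one extra unit of size under every admissible combination of the month dummies and the year dummy.

```python
from itertools import product

# Hypothetical coefficients (alpha, beta_1 ... beta_5), for illustration only.
ALPHA, B_SIZE, B_JAN, B_OCT, B_YEAR, B_INTER = 10, 2, 5, -3, -4, 1

def predicted_sales(size, january, october, year_dummy):
    """Fitted value of the interaction model for given regressor values."""
    return (ALPHA + B_SIZE * size + B_JAN * january + B_OCT * october
            + B_YEAR * year_dummy + B_INTER * year_dummy * size)

def size_effect(january, october, year_dummy, size=7):
    """Change in predicted sales for one extra unit of size, others fixed."""
    return (predicted_sales(size + 1, january, october, year_dummy)
            - predicted_sales(size, january, october, year_dummy))

effects = {
    (jan, octo, yr): size_effect(jan, octo, yr)
    for jan, octo, yr in product([0, 1], repeat=3)
    if not (jan and octo)  # an observation cannot be in both months
}
for key, value in sorted(effects.items()):
    print(key, value)  # (January, October, year dummy) -> effect of size
```

With these numbers the effect is 2 (that is, $\beta_1$) whenever the year dummy is 0 and 3 (that is, $\beta_1 + \beta_5$) whenever it is 1, whatever the values of the January and October dummies.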