Consider the number of passengers that, in 2009, used a given airport: $$D=[D_1, D_2, \ldots, D_m]^T$$ where $D_i$ represents the number of passengers that, in 2009, used the airport number $i$. I can access these values as part of my historical records.
Now, I want to see how these values depend on some other variables, which are:
- Variable $x_1^{(i)}$ is the population of the city where airport $i$ is located.
- Variable $x_2^{(i)}$ is the number of hotels within 100km from airport $i$.
- Variable $x_3^{(i)}$ is the GDP (Gross Domestic Product) of the city where airport $i$ is located.
I have all of these data for each airport for 2009.
So I fit a linear model in the logarithms, like this: $$\log{D_i}=\beta_0+\beta_1\log{x_1^{(i)}}+\beta_2\log{x_2^{(i)}}+\beta_3\log{x_3^{(i)}}$$ where, again, $i=1,2,\ldots,m$ ($m$ is the total number of airports I have in my database). I successfully fitted this model using MATLAB (but it can also be done by hand), and got the following values for the 4 parameters $\{\beta_0, \beta_1, \beta_2, \beta_3\}$:
- $\beta_0=-2$
- $\beta_1=0.15$
- $\beta_2=-0.1$
- $\beta_3=0.20$
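For reference, exponentiating both sides gives an equivalent multiplicative form of the same model. This is only a restatement, assuming $\log$ denotes the natural logarithm: $$D_i=e^{\beta_0}\left(x_1^{(i)}\right)^{\beta_1}\left(x_2^{(i)}\right)^{\beta_2}\left(x_3^{(i)}\right)^{\beta_3},\qquad \beta_j=\frac{\partial \log D_i}{\partial \log x_j^{(i)}},\quad j=1,2,3.$$ Read this way, each slope $\beta_j$ is an elasticity: a 1% increase in $x_j^{(i)}$, with the other variables held fixed, goes with a change of roughly $\beta_j$% in $D_i$.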
Now I'm trying to understand the meaning of these values, and more specifically their signs. For example, $\beta_3=0.20>0$, so the model says that the greater the GDP of the city where airport $i$ is located, the more passengers that airport had in 2009, which seems pretty logical to me.
But $\beta_2<0$ does not seem right to me. Shouldn't air traffic at airport $i$ increase as the number of hotels increases?
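To put a rough scale on these signs, here is a minimal sketch that computes what the fitted coefficients imply when a single variable is doubled and the others are held fixed. It assumes the log-log form of the model, under which scaling $x_j$ by a factor $r$ scales $D$ by $r^{\beta_j}$; the dictionary keys and the helper name `scaling_factor` are just illustrative.

```python
# Fitted slope coefficients quoted above (log-log model).
betas = {"population": 0.15, "hotels": -0.10, "gdp": 0.20}


def scaling_factor(beta: float, ratio: float) -> float:
    """Multiplicative change in D when one regressor is scaled by `ratio`
    and the other regressors are held fixed: D_new / D_old = ratio ** beta."""
    return ratio ** beta


# Effect of doubling each variable in turn.
doubling = {name: scaling_factor(beta, 2.0) for name, beta in betas.items()}

for name, factor in doubling.items():
    print(f"doubling {name}: D changes by {100 * (factor - 1):+.1f}%")
```

With these coefficients, doubling the population or the GDP goes with roughly 11% and 15% more passengers, while doubling the number of hotels goes with roughly 7% fewer.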
And what I totally don't understand is the sign of $\beta_0$, which I can't seem to relate with the rest of the variables. Why is $\beta_0$ negative and what does this mean, qualitatively speaking?
When $x_1 = x_2 = x_3 = 1$, all the logarithms vanish, so $\log D = \beta_0$ and $D=e^{\beta_0}$. That is, you can view $\beta_0$ (the intercept) as the baseline ($\log$) number of passengers when the logs of all the other variables are $0$. In your case $e^{-2}\approx 0.135$, so the baseline is less than one passenger. Whether that is sensible or not (like the negative sign of $\beta_2$) is a socioeconomic question rather than a mathematical or statistical one. From a technical point of view, biased estimators can arise from model misspecification, particularly when you omit relevant explanatory variables. Therefore, if the signs of the estimates are wrong from a theoretical perspective, you can try to collect more variables and/or estimate a different model.
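The omitted-variable point can be illustrated with a small simulation. This is only a sketch on synthetic data, not the airport data set: the "true" coefficients, the omitted variable `z`, and all helper names are hypothetical. The second slope is obtained by partialling out `z` (the Frisch-Waugh-Lovell approach) rather than by a full multiple regression, which keeps the code free of external libraries.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible


def ols_slope(x, y):
    """Slope of the simple least-squares regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after a simple regression on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = ols_slope(x, y)
    return [yi - my - b * (xi - mx) for xi, yi in zip(x, y)]


# Hypothetical "true" effects, chosen only for illustration.
TRUE_BETA_HOTELS = 0.5    # positive effect of log(hotels)
TRUE_BETA_OMITTED = -1.0  # negative effect of an omitted variable z

n = 5000
log_hotels = [random.gauss(0.0, 1.0) for _ in range(n)]
# z is positively correlated with log_hotels.
z = [h + random.gauss(0.0, 0.5) for h in log_hotels]
log_d = [
    TRUE_BETA_HOTELS * h + TRUE_BETA_OMITTED * zi + random.gauss(0.0, 0.3)
    for h, zi in zip(log_hotels, z)
]

# Misspecified model: z is left out, so its effect leaks into the hotel slope.
short_slope = ols_slope(log_hotels, log_d)

# Controlling for z: partial z out of both variables, then regress the residuals.
long_slope = ols_slope(residuals(z, log_hotels), residuals(z, log_d))

print(f"slope without z: {short_slope:.2f}")
print(f"slope with z:    {long_slope:.2f}")
```

In this setup the hotel slope comes out negative when `z` is omitted and close to the assumed value of 0.5 once `z` is controlled for. A sign flip of this kind is one possible explanation for a negative $\beta_2$, not a diagnosis of the actual data.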