Simple Linear Regression with Dummy Variable: Testing Gender Differences


Suppose we have a sample of individuals who report their wage $W$ and gender $D$ (with $D=1$ for male), and we fit the model $$ \ln W_{i}=\beta_{1}+\beta_{2} D_{i}+\varepsilon_{i} $$ The OLS estimators of the regression coefficients are $$ \begin{array}{c}\hat{\beta}_{1}=\frac{\sum_{i=1}^{n}\left(1-D_{i}\right) \ln W_{i}}{\sum_{i=1}^{n}\left(1-D_{i}\right)}=\ln \bar{W}_{f} \\ \hat{\beta}_{2}=\frac{\sum_{i=1}^{n} D_{i} \ln W_{i}}{\sum_{i=1}^{n} D_{i}}-\frac{\sum_{i=1}^{n}\left(1-D_{i}\right) \ln W_{i}}{\sum_{i=1}^{n}\left(1-D_{i}\right)}=\ln \bar{W}_{m}-\ln \bar{W}_{f}\end{array} $$ where $\bar{W}_{f}$ and $\bar{W}_{m}$ are the geometric mean wages of women and men, $$ \begin{array}{l}\bar{W}_{f}=\prod_{i=1}^{n} W_{i}^{\frac{1-D_{i}}{n_{f}}} \\ \bar{W}_{m}=\prod_{i=1}^{n} W_{i}^{\frac{D_{i}}{n_{m}}}\end{array} $$ with $n_{f}=\sum_{i=1}^{n}\left(1-D_{i}\right)$ and $n_{m}=\sum_{i=1}^{n} D_{i}$ the numbers of women and men in the sample.

Where do the expressions for $\hat{\beta}_{1}$ and $\hat{\beta}_{2}$ come from? I'm familiar with the standard formulas for the simple regression $y_{i}=\beta_{0}+\beta_{1} x_{i}+\varepsilon_{i}$ (slope $\hat{\beta}_{1}$, intercept $\hat{\beta}_{0}$): $\begin{array}{c}\hat{\beta}_{1}=\frac{\sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)}{\sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{2}} \\ \hat{\beta}_{0}=\bar{y}-\hat{\beta}_{1} \bar{x}\end{array}$


There is 1 best solution below


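Look up the OLS estimates for the one-way ANOVA model. In a nutshell, if your model is $$ y_i = \beta_0 + \beta_1 D_i + \epsilon_i, $$ where $D_i$ is a dummy variable with $D_i = 1$ if the $i$th observation belongs to group $1$ and $D_i = 0$ if it belongs to group $0$ (the reference group), then the OLS estimate of $\beta_0$ is the sample mean of $y$ in group $0$, $\bar{y}_0$, and the OLS estimate of $\beta_1$ is the difference between the sample means of group $1$ and group $0$, i.e., $\bar{y}_1 - \bar{y}_0$. This follows from your standard formulas with $x_i = D_i$: writing $n_1$ and $n_0$ for the group sizes and $n = n_0 + n_1$, we have $\bar{D} = n_1/n$, $\sum_i (D_i - \bar{D})^2 = n_1 n_0 / n$, and $\sum_i (D_i - \bar{D})(y_i - \bar{y}) = n_1(\bar{y}_1 - \bar{y}) = \frac{n_1 n_0}{n}(\bar{y}_1 - \bar{y}_0)$, so the slope is $\bar{y}_1 - \bar{y}_0$ and the intercept is $\bar{y} - \frac{n_1}{n}(\bar{y}_1 - \bar{y}_0) = \bar{y}_0$. Replace $y_i$ with $\ln W_i$ (and relabel $\beta_0, \beta_1$ as $\beta_1, \beta_2$), and you get the desired results.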
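
If it helps to see this numerically, here is a minimal sketch that checks the claim on simulated data. Everything in it is an assumption made for illustration: the sample size, the seed, the wage-generating process, and the variable names are invented and do not come from the question. It fits the regression of $\ln W_i$ on a constant and $D_i$ by least squares with NumPy and compares the coefficients with the group means of log wages and with the logs of the geometric mean wages.

```python
import numpy as np

# Simulated data (all numbers below are made up for illustration).
rng = np.random.default_rng(0)
n = 200
D = rng.integers(0, 2, size=n)  # 1 = male, 0 = female
W = np.exp(2.5 + 0.2 * D + rng.normal(0.0, 0.4, size=n))  # positive wages

# OLS of ln(W) on a constant and the dummy D.
y = np.log(W)
X = np.column_stack([np.ones(n), D])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Group means of log wages.
mean_f = y[D == 0].mean()
mean_m = y[D == 1].mean()

# Geometric mean wages, written as in the question.
n_f = np.sum(1 - D)
n_m = np.sum(D)
gm_f = np.prod(W ** ((1 - D) / n_f))
gm_m = np.prod(W ** (D / n_m))

print("intercept vs mean log wage (women):", beta_hat[0], mean_f)
print("slope vs difference in mean log wages:", beta_hat[1], mean_m - mean_f)
print("ln geometric means:", np.log(gm_f), np.log(gm_m))
```

The intercept should match the mean log wage of the $D=0$ group and the slope should match the difference in mean log wages, up to floating-point error; the log of each geometric mean should equal the corresponding mean of log wages.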