Why do we have an unbiased estimator if the data are missing at random?

42 Views Asked by Bumbble Comm At 31 Mar 2026 - 5:46

Suppose we are trying to model $Y$ on $X$ by linear regression. However, we only observe $(Y, X)$ for $R = 1$. Assume that the true model is: $$Y = X'\beta + e$$ with $E(e|X) = 0$.

We know that $P(R \mid Y, X) = P(R \mid X)$, meaning that, given $X$, the missingness has nothing to do with the outcome. Why can we ignore the missing data in this case? Why will we have unbiased estimators?

I have seen many resources on econometrics claiming this without giving a proof.

user140541 On 06 Feb 2022 - 8:33 BEST ANSWER

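Note that, with $R_i\in\{0,1\}$ indicating whether observation $i$ is observed and $\varepsilon_i$ denoting the error $e$ from the question, you can write $$ R_iY_i=R_iX_i^{\top}\beta+R_i\varepsilon_i, \quad i=1,\ldots, n. $$ The LS estimator of $\beta$ computed from the observed cases is then given by $$ \hat{\beta}=\beta+\left(\sum_{i=1}^n R_iX_iX_i^{\top}\right)^{-1}\sum_{i=1}^n R_iX_i\varepsilon_i, $$ where $R_i^2=R_i$ has been used. If $\mathsf{E}[\varepsilon_i\mid R_i,X_i]=0$ and the observations are independent across $i$, then taking the expectation conditional on all $(R_i,X_i)$ makes the second term vanish, so the estimator is unbiased. This holds if, for example, $R_i$ is conditionally independent of $\varepsilon_i$ given $X_i$ (MAR assumption) and $\mathsf{E}[\varepsilon_i\mid X_i]=0$ (standard OLS assumption), because then $\mathsf{E}[\varepsilon_i\mid R_i,X_i]=\mathsf{E}[\varepsilon_i\mid X_i]=0$.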
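A small Monte Carlo check can make the argument concrete. The sketch below is only an illustration under assumptions that are not in the original post: a single standard normal regressor plus an intercept, $\beta=(1,2)$, standard normal errors, and a logistic missingness mechanism. The names `complete_case_ols` and `simulate` are hypothetical helpers. It compares the average complete-case estimate when missingness depends on $X$ only with the case where it depends on $Y$.

```python
import numpy as np


def complete_case_ols(x, y, observed):
    """OLS of y on [1, x] using only the rows where `observed` is True."""
    design = np.column_stack([np.ones(observed.sum()), x[observed]])
    coef, *_ = np.linalg.lstsq(design, y[observed], rcond=None)
    return coef


def simulate(mechanism, n=500, reps=1000, seed=0):
    """Return (true beta, average complete-case estimate over `reps` samples).

    mechanism == "depends_on_x": P(R=1 | Y, X) = P(R=1 | X)  (assumed logistic in X)
    mechanism == "depends_on_y": P(R=1 | Y, X) depends on Y   (assumed logistic in Y)
    """
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, 2.0])  # assumed true intercept and slope
    estimates = np.empty((reps, 2))
    for k in range(reps):
        x = rng.normal(size=n)
        eps = rng.normal(size=n)  # E[eps | X] = 0 by construction
        y = beta[0] + beta[1] * x + eps
        if mechanism == "depends_on_x":
            p_obs = 1.0 / (1.0 + np.exp(-x))
        else:
            p_obs = 1.0 / (1.0 + np.exp(-(y - 1.0)))
        observed = rng.random(n) < p_obs  # R_i as a boolean mask
        estimates[k] = complete_case_ols(x, y, observed)
    return beta, estimates.mean(axis=0)


if __name__ == "__main__":
    for mechanism in ("depends_on_x", "depends_on_y"):
        beta, mean_estimate = simulate(mechanism)
        print(mechanism, "true:", beta, "mean estimate:", mean_estimate)
```

When missingness depends on $X$ only, the averaged estimates should sit close to the true $\beta$. When it depends on $Y$, the condition $\mathsf{E}[\varepsilon_i\mid R_i,X_i]=0$ fails and the averaged slope should move away from the truth.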
