Explaining Why the Zero Conditional Mean Assumption is Important


I am currently relearning econometrics in more depth than before. One thing I am trying to make sense of is why the assumption $$E(u\mid x)=E(u)$$ needs to hold (where $u$ is the error term).

Here is how I have tried to reason through it, although I am not sure if this is a good reasoning on why.

Let's say $u$ is somehow correlated with some variable $z$, which $x$ is also correlated with. In that case, $$E(u\mid x) \neq E(u)$$ since for greater $x$ values the expectation of the error would go up or down, because $u$ is correlated with $x$ through $z$. If that were so, the line of best fit would end up with systematically greater or smaller expected errors as $x$ increases or decreases.
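To check my intuition, here is a quick simulation sketch of that scenario (all the numbers are made up for illustration): $u$ and $x$ are both correlated with a common variable, call it $z$ here, and the fitted slope drifts away from the true one.

```python
import numpy as np

# Toy setup: z drives both x and the error u (omitted-variable situation).
rng = np.random.default_rng(0)
n = 100_000

z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)   # x correlated with z
u = 0.5 * z + rng.normal(size=n)   # u correlated with z, so E(u|x) != E(u)
y = 1.0 + 2.0 * x + u              # true slope is 2.0

# OLS slope: cov(x, y) / var(x)
b_hat = np.cov(x, y)[0, 1] / np.var(x)
print(b_hat)  # noticeably above 2.0: the slope absorbs part of u
```

Here $\operatorname{cov}(x,u) = 0.8 \cdot 0.5 = 0.4 \neq 0$, so the estimated slope is pulled up by roughly $\operatorname{cov}(x,u)/\operatorname{var}(x) \approx 0.24$.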

Is this what the zero conditional mean assumption is trying to say, or is there a better reasoning that I'm not hitting on?

Thank you!



This assumption means that the error $u$ doesn't vary with $x$ in expectation. Often we also have $\mathbb{E}u=0$, in which case the error is always centered on your prediction.

This is weaker than independence, though, where $\mathbb{E} [f(u)|x]=\mathbb{E}[f(u)]$ for all (measurable) functions $f$.

In particular, taking $f(u)=(u - \mathbb{E}[u\mid x])^2=(u-\mathbb{E}u)^2$ shows that $\mathbb{E}[f(u)\mid x] = \operatorname{Var}(u\mid x)$ can still vary with $x$ under this assumption. In other words: heteroskedasticity.
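A minimal numerical sketch of this point (my own toy setup, not from the question): take $u = x\,e$ with $e$ standard normal and independent of $x$. Then $\mathbb{E}[u\mid x]=0$, yet $\operatorname{Var}(u\mid x)=x^2$ varies with $x$.

```python
import numpy as np

# u = x * e with e independent of x: mean independence holds,
# but the conditional variance depends on x (heteroskedasticity).
rng = np.random.default_rng(1)
n = 200_000

x = rng.normal(size=n)
u = x * rng.normal(size=n)

# Conditional means are near zero on both halves of the x range...
print(u[x < 0].mean(), u[x >= 0].mean())
# ...but conditional variances differ sharply by |x|.
print(u[np.abs(x) < 0.5].var(), u[np.abs(x) > 1.5].var())
```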


Recall that you model the conditional expectation. If $\mathbb{E}[u\mid x]=g(x)$, then $$ \mathbb{E}[y\mid x] = \mathbb{E} [a + b x + u\mid x]=a+bx+g(x), $$ so $g(x)$ is a part you would need to model or approximate. If $g(x) = c$, a constant, you can simply absorb it into the intercept, writing $y=(a+c)+bx+\epsilon$ with $\mathbb{E}[\epsilon\mid x]=0$; otherwise you would have to impose explicit structure on $g(x)$. Hence the assumption $$ \mathbb{E}[u\mid x]=\mathbb{E}[u]=0 $$ means that, given $x$, once you discard the disturbance $u$ you are left with a model that is linear in the parameters. The relevant object is $\mathbb{E}[u\mid x]$ rather than the error term on its own, because you study the model conditional on $x$.
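To make the two cases concrete, here is a short simulation sketch (the coefficients are chosen arbitrarily): with a constant $g(x)$ the intercept shifts but the slope survives, while with $g(x)=0.5x$ the fitted slope absorbs $g$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
x = rng.normal(size=n)

def ols(x, y):
    """Simple one-regressor OLS: slope via cov/var, intercept via means."""
    b = np.cov(x, y)[0, 1] / np.var(x)
    a = y.mean() - b * x.mean()
    return a, b

# Case 1: g(x) = 3, a constant. True model y = 1 + 2x + u with E[u|x] = 3.
u1 = 3 + rng.normal(size=n)
a1, b1 = ols(x, 1 + 2 * x + u1)
print(a1, b1)  # intercept ≈ 1 + 3 = 4, slope ≈ 2

# Case 2: g(x) = 0.5x, nonconstant. The slope no longer recovers 2.
u2 = 0.5 * x + rng.normal(size=n)
a2, b2 = ols(x, 1 + 2 * x + u2)
print(b2)  # ≈ 2.5: the slope absorbs g(x)
```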


The assumption $E(u\mid x)=0$ is a sufficient condition for estimators like least squares to be unbiased. In general, such assumptions are made with an eye towards desirable properties of the estimators.
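As a sanity check, a small Monte Carlo sketch (setup invented for illustration): when the errors are drawn independently of $x$, so $E(u\mid x)=0$ holds by construction, the OLS slope averaged over many samples matches the true slope.

```python
import numpy as np

rng = np.random.default_rng(3)
true_b = 2.0
slopes = []
for _ in range(2000):
    x = rng.normal(size=200)
    u = rng.normal(size=200)  # drawn independently of x, so E(u|x) = 0
    y = 1.0 + true_b * x + u
    xc = x - x.mean()
    # Exact OLS slope: sum of centered cross-products over centered squares.
    slopes.append((xc * (y - y.mean())).sum() / (xc ** 2).sum())

print(np.mean(slopes))  # ≈ 2.0: no systematic bias
```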