Mean of residuals in OLS is 0

Assume $Y=X\beta_0+\epsilon$ where $\epsilon$ is zero mean and $X$ is fixed. I know that under certain conditions on the design matrix $X$ in OLS, the sample mean of the residuals $\bar{e}$ is $0.$ Can we say the same for the true population mean of residuals as well?

$EY=X\beta_0$, $\hat{Y}=X\hat{\beta}$ and $$E[e]=E(Y-\hat{Y})=E[X\beta_0-X\hat{\beta}]=X(\beta_0-E(\hat{\beta}))=0$$ since $\hat{\beta}$ is an unbiased estimator of $\beta_0$.

I don’t know why no references mention this fact. Am I making a mistake?

Best answer

If we are treating $X$ as fixed, which I take to mean non-random, then by assumption

$$E[\epsilon] = 0$$

Your argument is somewhat circular: you state $E[Y] = X \beta_0$, but that holds only because of the prior assumption $E[\epsilon] = 0$. It is a result, not an assumption (as far as I have seen). We have

\begin{align} Y &= X \beta_0 + \epsilon\\ E[Y] &= E[X \beta_0 + \epsilon]\\ E[Y] &= E[X \beta_0] + E[\epsilon]\\ E[Y] &= X \beta_0 + 0\\ E[Y] &= X \beta_0 \end{align}

where the fourth line is a result of our assumption and $X$ being non-random (as well as $\beta_0$ being a constant).

Edit:

I am not conflating them; however, I did not answer your question. I was a bit confused by the wording of your proposition.

Yes, the expectation of the residuals will be 0. Your proof is fine, except that $Y = X \beta_0 + \epsilon$, not $X \beta_0$, so the step that substitutes for $Y$ should carry an $\epsilon$ term. The $\epsilon$ vanishes in expectation, so it does not affect the result.
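
For completeness, here is the same computation with $\epsilon$ kept explicit. It is only a sketch, and it assumes that $\hat{\beta}$ exists and satisfies $E[\hat{\beta}] = \beta_0$, with $X$ non-random:

\begin{align} E[e] &= E[Y - \hat{Y}]\\ &= E[X \beta_0 + \epsilon - X \hat{\beta}]\\ &= X \beta_0 + E[\epsilon] - X E[\hat{\beta}]\\ &= X \beta_0 + 0 - X \beta_0\\ &= 0 \end{align}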

That the sample mean of the residuals is 0 is an algebraic property: a constant column in $X$ is sufficient for it, because the normal equations $X^\top e = 0$ then force the residuals to sum to 0. That the expectation of the residuals is 0 is a statistical property, and it holds in any case, provided $\hat{\beta}$ exists and is unbiased.
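
A small simulation can illustrate the difference between the two properties. This is only a sketch under assumptions that are not part of the question: the design, the coefficient values, the Gaussian errors, and the helper name `ols_residuals` are all made up for illustration. It fits one model with an intercept and one without, then averages the residual vector of the no-intercept model over many simulated error draws.

```python
import numpy as np


def ols_residuals(X, y):
    """Residuals y - X @ beta_hat from a least-squares fit (hypothetical helper)."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta_hat


rng = np.random.default_rng(0)
n = 50

# Fixed (non-random) design, drawn once and then held constant.
x = rng.uniform(1.0, 3.0, size=n)
X_with = np.column_stack([np.ones(n), x])  # includes a constant column
X_without = x[:, None]                     # no constant column

# Illustrative "true" coefficients (assumed values).
beta0_with = np.array([1.0, 2.0])
beta0_without = np.array([2.0])

# Algebraic property: with an intercept, the residuals of a single sample
# average to 0 up to floating-point error.
y = X_with @ beta0_with + rng.normal(size=n)
mean_with = ols_residuals(X_with, y).mean()

# Without an intercept, the sample mean of the residuals is generally not 0.
y2 = X_without @ beta0_without + rng.normal(size=n)
mean_without = ols_residuals(X_without, y2).mean()

# Statistical property: even without an intercept, each residual has
# expectation 0. Approximate E[e] by averaging over many error draws,
# fitting all simulated samples at once (one column per sample).
reps = 20000
eps = rng.normal(size=(n, reps))
Y_sim = (X_without @ beta0_without)[:, None] + eps
avg_resid = ols_residuals(X_without, Y_sim).mean(axis=1)

print("sample mean of residuals, with intercept:   ", mean_with)
print("sample mean of residuals, without intercept:", mean_without)
print("largest |Monte Carlo average residual|:     ", np.abs(avg_resid).max())
```

The first number should be zero up to rounding, the second is typically small but not zero, and the third shrinks toward zero as `reps` grows, which is what the expectation statement predicts.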