So we are loosening $\mathrm{E}(\varepsilon \mid X)=0$ to $\mathrm{E}\left(\varepsilon_{i} x_{i}\right)=0$, i.e. $\varepsilon_{i}$ and $x_{i}$ are uncorrelated. The regression model for the $i$th observation is $$ y_{i}=\beta_{0}+x_{i}^{\prime} \beta_{1}+\varepsilon_{i} $$ with $x_{i}=\left(x_{i 2} \cdots x_{i K}\right)^{\prime}$. The errors are i.i.d. with mean zero and variance $\sigma^2$, and the $x_i$ are also i.i.d.
By partial regression, the OLS estimator of $\beta_1$ is
$$ \begin{array}{c}\hat{\beta}_{1 n}=\left(X^{\prime} M_{1} X\right)^{-1} X^{\prime} M_{1} y \\ =\left(\sum_{i=1}^{n}\left(x_{i}-\bar{x}_{n}\right)\left(x_{i}-\bar{x}_{n}\right)^{\prime}\right)^{-1}\left(\sum_{i=1}^{n}\left(x_{i}-\bar{x}_{n}\right) y_{i}\right)\end{array} $$ with $M_{1}=I-\frac{1}{n} \iota \iota^{\prime}$ and $\bar{x}_{n}$ the vector of sample means of the non-constant independent variables. Here we use the subscript $n$ to indicate that the OLS estimator uses a sample of size $n$.
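A quick numerical check may make the role of $M_{1}$ more concrete. The sketch below assumes NumPy and uses made-up data (the sample size, coefficients and seed are hypothetical, not from the text). It compares three ways of getting the slopes: full OLS with a constant, the $\left(X^{\prime} M_{1} X\right)^{-1} X^{\prime} M_{1} y$ form, and the demeaned-sums form.

```python
import numpy as np

# Hypothetical simulated data: n observations, k non-constant regressors.
rng = np.random.default_rng(0)
n, k = 200, 3
X = rng.normal(size=(n, k))
y = 1.0 + X @ np.array([0.5, -2.0, 1.5]) + rng.normal(size=n)

# (a) Full OLS on [constant, X]; the slopes are entries 1..k.
Z = np.column_stack([np.ones(n), X])
beta_full = np.linalg.lstsq(Z, y, rcond=None)[0]

# (b) Partial regression: M1 = I - (1/n) * iota iota' removes column means.
iota = np.ones((n, 1))
M1 = np.eye(n) - iota @ iota.T / n
beta_partial = np.linalg.solve(X.T @ M1 @ X, X.T @ M1 @ y)

# (c) The same thing written with demeaned regressors.
Xc = X - X.mean(axis=0)
beta_sums = np.linalg.solve(Xc.T @ Xc, Xc.T @ y)

slopes_match = bool(
    np.allclose(beta_full[1:], beta_partial)
    and np.allclose(beta_partial, beta_sums)
)
print(slopes_match)
```

In this sketch `M1 @ X` is just `X` with each column's mean subtracted, which is why the matrix form and the sums form coincide.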
Q: I'm confused about 'partial regression' here. I do not understand what he's doing with $M_1$. I thought M was usually the orthogonal complement of the projection matrix P. Not sure if it's the same here? Why is M showing up in the middle of the usual formula $(X'X)^{-1} X'y$? I'll include the rest of the derivation below, but I'm still stuck on the above before I can follow the rest.
Upon substitution of $y_{i}=\beta_{0}+x_{i}^{\prime} \beta_{1}+\varepsilon_{i}$ we have $$ \hat{\beta}_{1 n}-\beta_{1}=\left(\sum_{i=1}^{n}\left(x_{i}-\bar{x}_{n}\right)\left(x_{i}-\bar{x}_{n}\right)^{\prime}\right)^{-1}\left(\sum_{i=1}^{n}\left(x_{i}-\bar{x}_{n}\right) \varepsilon_{i}\right) $$ Now $$ \frac{1}{n} \sum_{i=1}^{n}\left(x_{i}-\bar{x}_{n}\right)\left(x_{i}-\bar{x}_{n}\right)^{\prime}=\frac{1}{n} \sum_{i=1}^{n}\left(x_{i}-\mu_{x}\right)\left(x_{i}-\mu_{x}\right)^{\prime}-\left(\bar{x}_{n}-\mu_{x}\right)\left(\bar{x}_{n}-\mu_{x}\right)^{\prime} $$
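The decomposition of the demeaned second-moment matrix is purely algebraic, so it can be checked numerically. The following is a minimal sketch assuming NumPy, with a hypothetical `mu_x` and simulated regressors; the identity should hold for any fixed vector in place of $\mu_{x}$.

```python
import numpy as np

# Hypothetical population mean and simulated regressors.
rng = np.random.default_rng(1)
n, k = 500, 2
mu_x = np.array([1.0, -3.0])
X = mu_x + rng.normal(size=(n, k))
xbar = X.mean(axis=0)

# Left-hand side: second moments around the sample mean.
lhs = (X - xbar).T @ (X - xbar) / n

# Right-hand side: second moments around mu_x minus the outer-product correction.
rhs = (X - mu_x).T @ (X - mu_x) / n - np.outer(xbar - mu_x, xbar - mu_x)

identity_holds = bool(np.allclose(lhs, rhs))
print(identity_holds)
```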
Also $$ \frac{1}{n} \sum_{i=1}^{n}\left(x_{i}-\bar{x}_{n}\right) \varepsilon_{i}=\frac{1}{n} \sum_{i=1}^{n} x_{i} \varepsilon_{i}-\bar{x}_{n} \frac{1}{n} \sum_{i=1}^{n} \varepsilon_{i} $$
with the first term on the rhs converging to 0 in probability by the weak law of large numbers, since $\mathrm{E}\left(x_{i} \varepsilon_{i}\right)=0$.
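To see the weaker assumption at work, here is a small simulation sketch (assuming NumPy; the design is hypothetical and chosen only for illustration). With a scalar $x_i \sim N(0,1)$ and $\varepsilon_i = x_i^2 - 1$, we have $\mathrm{E}(\varepsilon_i \mid x_i) \neq 0$ but $\mathrm{E}(\varepsilon_i)=0$ and $\mathrm{E}(x_i \varepsilon_i)=\mathrm{E}(x_i^3)-\mathrm{E}(x_i)=0$, so the slope estimate should still settle near the true value as $n$ grows.

```python
import numpy as np

BETA0, BETA1 = 1.0, 2.0  # hypothetical true coefficients


def slope_error(n, rng):
    """Return |beta1_hat - BETA1| for one simulated sample of size n."""
    x = rng.normal(size=n)
    eps = x**2 - 1.0          # mean zero, uncorrelated with x, but E(eps | x) != 0
    y = BETA0 + BETA1 * x + eps
    xc = x - x.mean()         # demeaning plays the role of M1
    beta1_hat = (xc @ y) / (xc @ xc)
    return abs(beta1_hat - BETA1)


rng = np.random.default_rng(2)
errors = {n: slope_error(n, rng) for n in (100, 10_000, 200_000)}
for n, err in errors.items():
    print(n, err)
```

The printed errors depend on the seed, so no specific values are claimed here; the point is only that the error at the largest sample size should be small.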