For a pupil, $i$, selected at random from a school, the number of years of education of their parents, $X_i$, is given by: $$ X_{i}=\mu+\varepsilon_{i} $$ where $\varepsilon_{i} \sim \text{iid}\left(0, \sigma^{2}\right)$. Here $\mu$ is the mean number of years of education completed by parents. For a sample of N students selected independently from the population:
(e) Is the sample mean BLUE? Either way, prove it.
Answer: The first part of the proof derives the condition under which a linear estimator is unbiased. The second part shows that no unbiased linear estimator can have a smaller variance than the sample mean.
Define a linear estimator $\tilde{X}=\frac{1}{N} \sum_{i=1}^{N} w_{i} X_{i}$ with weights $w_{i}=1+\delta_{i}$. The 1 here is the weight used by the sample mean, so we are saying that the weights of our new estimator differ from those of the sample mean by the amount $\delta_{i}$.
$$
\mathbb{E}(\tilde{X})=\mathbb{E}\left(\frac{1}{N} \sum_{i=1}^{N}\left(1+\delta_{i}\right) X_{i}\right)=\mu+\frac{\mu}{N} \sum_{i=1}^{N} \delta_{i}
$$
Hence we must have $\sum_{i=1}^{N} \delta_{i}=0$ for our new linear estimator to be unbiased. Now we derive variance:
$$
\operatorname{Var}(\tilde{X})=\frac{1}{N^{2}} \sum_{i=1}^{N}\left(1+\delta_{i}\right)^{2} \sigma^{2}=\frac{\sigma^{2}}{N}+\frac{\sigma^{2}}{N^{2}} \sum_{i=1}^{N}\left(2 \delta_{i}+\delta_{i}^{2}\right)=\operatorname{Var}(\bar{X})+\frac{\sigma^{2}}{N^{2}} \sum_{i=1}^{N} \delta_{i}^{2}
$$
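The variance step can be unpacked as follows. This is a sketch that uses only two facts: the $X_i$ are independent with $\operatorname{Var}(X_i)=\sigma^{2}$, so the variance of a weighted sum is the sum of the squared weights times $\sigma^{2}$, and $\sum_{i=1}^{N} \delta_{i}=0$ from the unbiasedness condition, which removes the cross term.
$$
\begin{aligned}
\operatorname{Var}(\tilde{X}) &= \operatorname{Var}\left(\frac{1}{N} \sum_{i=1}^{N}\left(1+\delta_{i}\right) X_{i}\right)=\frac{1}{N^{2}} \sum_{i=1}^{N}\left(1+\delta_{i}\right)^{2} \operatorname{Var}\left(X_{i}\right) \\
&= \frac{\sigma^{2}}{N^{2}} \sum_{i=1}^{N}\left(1+2 \delta_{i}+\delta_{i}^{2}\right)=\frac{\sigma^{2}}{N^{2}}\left(N+2 \sum_{i=1}^{N} \delta_{i}+\sum_{i=1}^{N} \delta_{i}^{2}\right) \\
&= \frac{\sigma^{2}}{N}+\frac{\sigma^{2}}{N^{2}} \sum_{i=1}^{N} \delta_{i}^{2}
\end{aligned}
$$
The first term, $\sigma^{2}/N$, is exactly $\operatorname{Var}(\bar{X})$, which is the case $\delta_{i}=0$ for all $i$.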
Finally, note that unless every $\delta_{i}=0$ (in which case $\tilde{X}$ is just the sample mean), the expression $\sum_{i=1}^{N} \delta_{i}^{2}=\eta>0$, so the extra term in the variance is strictly positive and the variance of the new estimator is greater than that of the sample mean. This proves that any unbiased linear estimator other than the sample mean has a greater sampling variance, so the sample mean is BLUE.
$$
\operatorname{Var}(\tilde{X})=\operatorname{Var}(\bar{X})+\frac{\sigma^{2}}{N^{2}} \eta
$$
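As a rough numerical check of the decomposition, the sketch below compares the exact variance of a re-weighted estimator with $\operatorname{Var}(\bar{X})$ plus the penalty term, and also estimates the variance by simulation. The function names, the particular $\delta_i$ values, and the choice of normal errors are illustrative assumptions, not part of the original problem; the result itself only needs errors with mean 0 and variance $\sigma^2$.

```python
import math
import random
import statistics


def linear_estimator_variance(deltas, sigma2):
    """Exact variance of (1/N) * sum((1 + d_i) * X_i) for independent X_i,
    each with variance sigma2."""
    n = len(deltas)
    return sigma2 / n**2 * sum((1 + d) ** 2 for d in deltas)


def simulate_variance(deltas, mu, sigma, reps=20000, seed=0):
    """Monte Carlo estimate of the same variance (normal errors assumed)."""
    rng = random.Random(seed)
    n = len(deltas)
    estimates = []
    for _ in range(reps):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        estimates.append(sum((1 + d) * x for d, x in zip(deltas, xs)) / n)
    return statistics.pvariance(estimates)


# Hypothetical deviations from the sample-mean weights; they sum to zero,
# so the estimator stays unbiased.
deltas = [0.5, -0.5, 0.25, -0.25]
sigma2 = 4.0
n = len(deltas)

var_mean = sigma2 / n                                # Var of the sample mean
penalty = sigma2 / n**2 * sum(d**2 for d in deltas)  # extra term from the deltas
exact = linear_estimator_variance(deltas, sigma2)
simulated = simulate_variance(deltas, mu=12.0, sigma=math.sqrt(sigma2))

print(exact, var_mean + penalty, simulated)
```

The exact variance and the sum `var_mean + penalty` agree, and the simulated value should land close to both; setting every entry of `deltas` to zero recovers the variance of the sample mean.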
Q: The expected value expression makes sense, but where does the variance expression come from and how does he get that?