Good morning, I'm reading lecture slides about the BLUE properties of the OLS estimator:
- Conditional unbiasedness
- Conditional variance
My question:
I have two equalities from the two slides:
$$E\left(\left(\boldsymbol{X}^{\prime} \boldsymbol{X}\right)^{-1} \boldsymbol{X}^{\prime} \boldsymbol{\epsilon} | \boldsymbol{X}\right)=\left(\boldsymbol{X}^{\prime} \boldsymbol{X}\right)^{-1} \boldsymbol{X}^{\prime} E(\boldsymbol{\epsilon} | \boldsymbol{X})$$ and $$\operatorname{Var}\left(\left(\boldsymbol{X}^{\prime} \boldsymbol{X}\right)^{-1} \boldsymbol{X}^{\prime} \boldsymbol{\epsilon} | \boldsymbol{X}\right) =\left(\boldsymbol{X}^{\prime} \boldsymbol{X}\right)^{-1} \boldsymbol{X}^{\prime} \operatorname{Var}(\boldsymbol{\epsilon} | \boldsymbol{X})\left(\left(\boldsymbol{X}^{\prime} \boldsymbol{X}\right)^{-1} \boldsymbol{X}^{\prime}\right)^{\prime}$$
I understand that because we condition on $\boldsymbol{X}$, the matrix $\left(\boldsymbol{X}^{\prime} \boldsymbol{X}\right)^{-1} \boldsymbol{X}^{\prime}$ is treated as constant, hence we can take it out of the expectation operator. What I could not understand is why taking it out of $\operatorname{Var}$ works differently: it seems that we get a sandwich form $\operatorname{Var} (g(\boldsymbol{X}) \boldsymbol{\epsilon} | \boldsymbol{X}) = g(\boldsymbol{X}) \operatorname{Var} ( \boldsymbol{\epsilon} | \boldsymbol{X}) \, g(\boldsymbol{X})^{\prime}$, with the constant appearing twice (once transposed), rather than just once as in the expectation.
Could you please elaborate on this point?


You can prove the variance formula for a random vector $X$ directly and show that $$ Var(AX) = A \, Var(X) \, A^T, $$ where $A$ is a constant matrix. The constant comes out twice precisely because the variance is quadratic in its argument, whereas the expectation is linear.
Proof: Let $X$ be a random vector with $\mathbb{E} X= \mu$ and $Var(X) = \Sigma$, and let $A$ be a constant matrix. The variance (more precisely, the covariance matrix) of a random vector $X$ is defined as $\mathbb{E} \left( [ X- \mathbb{E}[X]][ X- \mathbb{E}[X]] ^T\right)$. Then
\begin{align} Var(AX) &= \mathbb{E}\left( [ AX - A \mu ][ AX - A \mu] ^T \right) \\ &= \mathbb{E}\left( A[ X - \mu ][ X - \mu] ^T A^T \right) \\ &= A \, \mathbb{E}\left( [ X - \mu ][ X - \mu] ^T \right) A^T \\ &= A \, Var(X) \, A^T, \end{align}
where the second line uses $(AB)^T = B^T A^T$, and the third line takes the constant matrices $A$ and $A^T$ out of the (linear) expectation on each side.
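If it helps to see the identity numerically, here is a small simulation sketch (with an arbitrary made-up $\Sigma$ and $A$, not taken from your slides): the empirical covariance of $AX$ should match the sandwich $A \Sigma A^T$ up to sampling error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary illustrative choices of mu, Sigma = Var(X), and a constant 2x3 matrix A.
mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
A = np.array([[1.0, 2.0, -1.0],
              [0.0, 1.0, 3.0]])

# Draw many samples of X and apply the linear map: each row of Y is A x_i.
X = rng.multivariate_normal(mu, Sigma, size=200_000)
Y = X @ A.T

empirical = np.cov(Y, rowvar=False)   # sample covariance matrix of AX
theoretical = A @ Sigma @ A.T         # the sandwich formula

print(empirical)
print(theoretical)
```

The two printed matrices agree to roughly two decimal places with this sample size, which is exactly the $A \, Var(X) \, A^T$ structure you see in the conditional variance of the OLS estimator.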