Conditional mean squared error vs unconditional mean squared error


Suppose you are trying to predict $Y$ from a set of predictors $X$. Whether you consider the unconditional mean squared error $E[(Y-\hat{Y})^2]$ or the conditional one $E[(Y-\hat{Y})^2 \mid X]$, the minimizer is $E[Y|X]$.

Since $E[Y|X]$ is often hard to compute, we can instead rely on a linear predictor $\tilde{Y} = a + b\,[X - E[X]]$.
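For concreteness, here is the standard closed form for the unconditional problem (a routine derivation, stated for completeness). Setting the derivatives of $E[(Y - a - b\,[X - E[X]])^2]$ with respect to $a$ and $b$ to zero gives

$$
a^* = E[Y], \qquad b^* = \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(X)},
$$

since the first-order condition in $a$ reads $E[Y] - a = 0$ and the one in $b$ reads $\operatorname{Cov}(X, Y) - b\,\operatorname{Var}(X) = 0$. Note that $a^*$ and $b^*$ are constants, not functions of $X$.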

To find the values of $a$ and $b$, we usually minimize $E[(Y-\tilde{Y})^2]$, not $E[(Y-\tilde{Y})^2 \mid X]$. I understand that minimizing the latter would make the optimal $a$ and $b$ functions of $X$, bringing us back to $E[Y|X]$ and defeating the purpose of a fixed linear predictor. Is that the only reason to minimize the former rather than the latter in the case of linear predictors?
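The claim that the unconditional minimizer is $a = E[Y]$, $b = \operatorname{Cov}(X,Y)/\operatorname{Var}(X)$ can be checked numerically. A minimal sketch on simulated data (the distribution and coefficients below are illustrative, not from the question):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data-generating process: Y = 2 + 3*X + noise,
# with E[X] = 1, so E[Y] = 2 + 3*1 = 5 and the true slope is 3.
n = 200_000
X = rng.normal(loc=1.0, scale=2.0, size=n)
Y = 2.0 + 3.0 * X + rng.normal(scale=1.0, size=n)

# Closed-form minimizer of the unconditional MSE
# E[(Y - a - b*(X - E[X]))^2]:  a = E[Y],  b = Cov(X, Y) / Var(X).
a = Y.mean()
b = np.cov(X, Y, ddof=0)[0, 1] / X.var()

# Compare against brute-force least squares on the centered predictor.
Xc = np.column_stack([np.ones(n), X - X.mean()])
a_ls, b_ls = np.linalg.lstsq(Xc, Y, rcond=None)[0]

print(a, b)        # close to 5 and 3
print(a_ls, b_ls)  # agrees with the closed form
```

Both routes give the same constant coefficients, which is exactly what minimizing the unconditional criterion buys: a single predictor that does not vary with $X$.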