Bias-Variance Tradeoff Derivation


I was looking at this derivation of the bias-variance tradeoff. In the last block of the derivation it is claimed that

$\text{E}[f\hat{f}] + \text{E}[\varepsilon\hat{f}] = \text{E}[f\hat{f}] + \text{E}[\varepsilon]\text{E}[\hat{f}]$

since $\hat{f}$ and $\varepsilon$ are independent. Can someone explain why they are independent? This question has been asked before, and, as one comment points out, the independence claim isn't intuitively obvious:

Intuitively, $\varepsilon$ is the irreducible error due to, e.g., measurement noise, which is independent of the model estimate $\hat{f}$. In practice, however, the level of noise in the training data determines the data quality, which in turn has a large impact on the fitted $\hat{f}$.
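For what it's worth, here is a small Monte Carlo sketch of the setting I *think* the derivation assumes: $\hat{f}$ is fit on a training set, and $\varepsilon$ is the noise on a *fresh* test point, drawn independently of that training set. The target function, noise level, and polynomial model below are my own illustrative choices, not from the linked derivation.

```python
import numpy as np

# Assumed setup (not from the linked derivation): f(x) = sin(x),
# training noise and test noise both N(0, 0.3^2).
rng = np.random.default_rng(0)

def fit_fhat(rng, n=30, sigma=0.3):
    """Fit a degree-3 polynomial to n noisy samples of sin(x)."""
    x = rng.uniform(0, 2 * np.pi, n)
    y = np.sin(x) + rng.normal(0, sigma, n)
    return np.polynomial.Polynomial.fit(x, y, deg=3)

x0 = 1.0      # a fixed test input
reps = 5000

# Each round: fit f_hat on a fresh training set, then draw a fresh
# test-point noise eps that is independent of that training set.
fhat_x0 = np.array([fit_fhat(rng)(x0) for _ in range(reps)])
eps = rng.normal(0, 0.3, reps)

lhs = np.mean(eps * fhat_x0)           # estimates E[eps * f_hat]
rhs = np.mean(eps) * np.mean(fhat_x0)  # estimates E[eps] * E[f_hat]
print(lhs, rhs)  # both should be close to 0, since E[eps] = 0
```

Under this setup the two estimates agree (both near zero), which is consistent with $\text{E}[\varepsilon\hat{f}] = \text{E}[\varepsilon]\text{E}[\hat{f}]$. But note the sketch only works because `eps` here is *test* noise; it says nothing about the training noise already baked into $\hat{f}$, which is exactly what the quoted comment is worried about.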

Thanks a lot!