Consider:
- unknown random variable $\Theta$
- observed random variable $X$
The best Least Mean Squares estimate $\hat{\Theta}$ of $\Theta$ under the mean squared error criterion $\mathbb{E}[(\hat{\Theta} - \Theta)^2 \mid X]$ is $$\hat{\Theta}=\mathbb{E}[\Theta \mid X].$$ This achieves the best possible error $\mathbb{E}[(\hat{\Theta} - \Theta)^2 \mid X]=Var(\Theta \mid X)$. The best *linear* Least Mean Squares estimate, however, is $$\hat{\Theta}_L= \mathbb{E}[\Theta] + \rho\frac{\sigma_{\Theta}}{\sigma_{X}}(X-\mathbb{E}[X]),$$ which results in the error $\mathbb{E}[(\hat{\Theta}_L - \Theta)^2]=(1-\rho^2)Var(\Theta)$, where $\rho$ is the correlation coefficient between $X$ and $\Theta$.
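As a sanity check of the LLMS formula, here is a quick Monte Carlo sketch in Python/NumPy. It assumes a jointly Gaussian model $X = \Theta + W$ with independent zero-mean Gaussian $\Theta$ and $W$ (my own choice of example, not from any textbook), in which case the empirical error of $\hat{\Theta}_L$ should match $(1-\rho^2)Var(\Theta)$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
sigma_theta, sigma_w = 2.0, 1.0  # assumed standard deviations for the example

theta = rng.normal(0.0, sigma_theta, n)  # unknown Theta ~ N(0, sigma_theta^2)
x = theta + rng.normal(0.0, sigma_w, n)  # observation X = Theta + noise W

# For this model: Var(X) = sigma_theta^2 + sigma_w^2, Cov(X, Theta) = sigma_theta^2
sigma_x = np.sqrt(sigma_theta**2 + sigma_w**2)
rho = sigma_theta / sigma_x  # correlation coefficient between X and Theta

# LLMS estimator: E[Theta] + rho * (sigma_theta / sigma_x) * (X - E[X]);
# both means are zero here, so the estimator is purely the scaled observation.
theta_hat_L = rho * (sigma_theta / sigma_x) * x

mse_empirical = np.mean((theta_hat_L - theta) ** 2)
mse_formula = (1 - rho**2) * sigma_theta**2

print(mse_empirical, mse_formula)  # the two values should nearly agree
```

(In this Gaussian example $\mathbb{E}[\Theta \mid X]$ happens to be linear in $X$, so the LMS and LLMS estimators coincide and both attain the error $(1-\rho^2)Var(\Theta)$.)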
This doesn't make sense to me: $\hat{\Theta}=h(X)=\mathbb{E}[\Theta|X]$ is the optimal estimator under mean squared error, yet the error of $\hat{\Theta}_L=f(X)=\mathbb{E}[\Theta] + \rho\frac{\sigma_{\Theta}}{\sigma_{X}}(X-\mathbb{E}[X])$ can go down to $0$ when $|\rho|=1$. Does that mean the Linear Least Mean Squares estimator can be better than the regular Least Mean Squares one? I'm very confused.
The only explanation I can think of is that $Var(\Theta|X)=0$ when $|\rho|=1$. That still confuses me, because wouldn't that then mean $Var(\Theta) = Var(\Theta|X) = 0$?