I quarantined myself at home and am currently self-studying from Casella and Berger's *Statistical Inference*.
In Section 2.2, Example 2.2.6 (Minimizing distance), they want to show that the expectation is a good guess for the value of a random variable $X$. So let $b$ be a constant.
$(X-b)^2$ measures the squared distance between $b$ and $X$: the closer they are, the smaller it is. $E(X-b)^2$ then measures that distance on average.
They then minimize this value without using calculus. The argument goes as follows:
$$E(X-b)^2 = E(X-EX+EX-b)^2 = E\big((X-EX)+(EX-b)\big)^2 = E(X-EX)^2 + (EX-b)^2 + 2E\big((X-EX)(EX-b)\big)$$
The cross term vanishes because $EX-b$ is a constant and $E(X-EX) = EX - EX = 0$, so we get $E(X-b)^2 = E(X-EX)^2+(EX-b)^2$. They then observe that we have no control over the first term on the right-hand side, while the second term is always greater than or equal to $0$ and can be made equal to $0$ by choosing $b=EX$. Hence: $\min_b E(X-b)^2 = E(X-EX)^2$
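Just as a sanity check (not from the book; the normal distribution, its parameters, and the sample size below are arbitrary choices of mine), one can verify the identity $E(X-b)^2 = E(X-EX)^2 + (EX-b)^2$ numerically on a simulated sample, where sample means stand in for expectations:

```python
# Numerical check of E(X-b)^2 = E(X-EX)^2 + (EX-b)^2 on a simulated sample.
# The distribution (normal with mean 2, sd 1.5) is an arbitrary illustration.
import random

random.seed(0)
xs = [random.gauss(2.0, 1.5) for _ in range(100_000)]   # sample of X
mean = sum(xs) / len(xs)                                 # estimate of EX
var = sum((x - mean) ** 2 for x in xs) / len(xs)         # estimate of E(X-EX)^2

def msd(b):
    """Sample estimate of E(X-b)^2 for a given constant b."""
    return sum((x - b) ** 2 for x in xs) / len(xs)

# The identity holds for every b (up to floating-point error):
for b in [0.0, 1.0, mean, 3.0]:
    assert abs(msd(b) - (var + (mean - b) ** 2)) < 1e-6

# Choosing b = EX attains the minimum, and the minimum equals the variance:
assert msd(mean) <= min(msd(0.0), msd(1.0), msd(3.0))
print("min over b of E(X-b)^2 is at b = EX, with value", msd(mean))
```

The check mirrors the proof exactly: for any $b$ the sample analogue of $E(X-b)^2$ decomposes into the variance plus $(EX-b)^2$, and the latter is zero precisely when $b = EX$.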
Okay, what I don't understand is: if we choose $b=EX$, shouldn't the $b$ on the left-hand side change too? Or are we just saying that since the minimum value of $(EX-b)^2$ is $0$, we simply take it, since we are trying to minimize the distance?
You are of course right: $b$ varies in the same manner on both sides of the equation. That's precisely what we want. We want to minimize $E(X-b)^2$, so we investigate how this value varies as a function of $b$, find that it equals $E(X-EX)^2+(EX-b)^2$, and conclude that it is minimal for $b=EX$ (at which point it takes the minimal value $E(X-EX)^2$).