Prove that mean square error equals expected conditional variance


I'm a first-year grad student in Statistics. The book I'm using mentioned conditional variance, and I wanted to read up more about it. I went down the Google rabbit hole and found this website. I read through it and followed the proofs. Then I came to this chunk, which I can't prove myself.

From the definition of conditional variance and the basic property above, it follows that the mean square error when $E(Y \mid X)$ is used as a predictor of $Y$ is:
$$E([Y-E(Y \mid X)]^2)=E[\operatorname{Var}(Y \mid X)]=\operatorname{Var}(Y)-\operatorname{Var}[E(Y \mid X)]$$

When I expand the LHS, I get the following:
$$\begin{split} E([Y-E(Y|X)]^2) &= E(Y^2-2YE[Y|X]+E[Y|X]^2) \\ &= E[Y^2] - 2E[YE[Y|X]] + E[E[Y|X]^2] \\ &= E[Y^2] - 2E[Y^2] + \operatorname{Var}(E[Y|X]) + E[E[Y|X]]^2 \\ &= \operatorname{Var}(E[Y|X]) + E[Y]^2 - E[Y^2] \\ &= \operatorname{Var}(E[Y|X]) - \operatorname{Var}(Y) \end{split}$$
However, this is off by a factor of $-1$. Can anyone point out where I went awry?


Up to $E[Y^2] - 2E[YE[Y|X]] + E[E[Y|X]^2]$ your computation is correct. I do not understand the line after this, though. It seems that you decided $E[YE[Y|X]]=E[Y^2]$? That's not right.
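To see that this identity can fail, consider the special case (assumed here only for illustration) in which $X$ and $Y$ are independent, so that $E[Y|X]=E[Y]$ is a constant:
$$E\big[Y\,E[Y|X]\big] = E\big[Y \cdot E[Y]\big] = E[Y]^2 = E[Y^2] - \operatorname{Var}(Y)$$
So in this case $E[YE[Y|X]]$ agrees with $E[Y^2]$ only when $\operatorname{Var}(Y)=0$, that is, when $Y$ is almost surely constant.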

Let's write $Z=E[Y|X]$ to make the formulas more digestible. By the definition of conditional expectation, $Z$ has the same integral as $Y$ on every set in the $\sigma$-algebra generated by $X$; as a consequence, $Y-Z$ integrates to zero on every such set. Since $Z$ is itself measurable with respect to that $\sigma$-algebra, it can be approximated by linear combinations of indicator functions of such sets, and this implies $$E[(Y-Z) Z] = 0 \tag1$$ Property (1) makes a good deal of geometric sense if you think of $Z$ as the orthogonal projection of $Y$ onto a linear space of functions (those measurable with respect to the $\sigma$-algebra generated by $X$). Relation (1) says that the triangle formed by $Y$, $Z$, $0$ is right-angled, which is to be expected from an orthogonal projection.
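If you prefer to avoid the measure-theoretic language, (1) can also be checked with the tower property together with the fact that a function of $X$ can be pulled out of $E[\,\cdot \mid X]$. This is a sketch under the assumption that $E[Y^2]<\infty$, so that every expectation below is finite:
$$E[YZ] = E\big[E[YZ \mid X]\big] = E\big[Z\,E[Y \mid X]\big] = E[Z^2], \qquad\text{hence}\qquad E[(Y-Z)Z] = E[YZ] - E[Z^2] = 0$$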

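Using (1) in the form $E[YZ]=E[Z^2]$, we get $$ E[Y^2] - 2E[YZ] + E[Z^2] = E[Y^2] - 2E[Z^2] + E[Z^2] = E[Y^2]-E[Z^2] $$ And $E[Y^2]-E[Z^2]$ is equal to $\operatorname{Var}Y-\operatorname{Var}Z$, because $E[Y]=E[Z]$, so the squared means $E[Y]^2$ and $E[Z]^2$ cancel when each second moment is written as variance plus squared mean.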
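If you want to sanity-check the whole chain of equalities, here is a minimal sketch in Python that does it exactly, with rational arithmetic, on a small discrete example. The joint pmf and the helper names (`expect`, `cond_expect`, `z`, `cond_var`) are made up for illustration and do not come from the website you quote.

```python
from fractions import Fraction as F

# Hypothetical joint pmf P(X = x, Y = y); the numbers are made up for illustration.
pmf = {(0, 1): F(1, 8), (0, 3): F(3, 8), (1, 2): F(1, 4), (1, 6): F(1, 4)}


def expect(g):
    """Return E[g(X, Y)] under the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())


def cond_expect(g, x0):
    """Return E[g(Y) | X = x0]."""
    p_x0 = sum(p for (x, _), p in pmf.items() if x == x0)
    return sum(p * g(y) for (x, y), p in pmf.items() if x == x0) / p_x0


def z(x):
    """Z = E[Y | X] evaluated at X = x."""
    return cond_expect(lambda y: y, x)


def cond_var(x):
    """Var(Y | X = x)."""
    return cond_expect(lambda y: (y - z(x)) ** 2, x)


mse = expect(lambda x, y: (y - z(x)) ** 2)           # E[(Y - Z)^2]
mean_cond_var = expect(lambda x, y: cond_var(x))     # E[Var(Y | X)]
var_y = expect(lambda x, y: y ** 2) - expect(lambda x, y: y) ** 2
var_z = expect(lambda x, y: z(x) ** 2) - expect(lambda x, y: z(x)) ** 2
cross = expect(lambda x, y: (y - z(x)) * z(x))       # left-hand side of (1)

assert cross == 0                                    # orthogonality relation (1)
assert mse == mean_cond_var == var_y - var_z         # the identity in question
print(mse, var_y, var_z)                             # prints: 19/8 47/16 9/16
```

Because everything is a `Fraction`, the equalities hold exactly rather than up to rounding. A single example is of course not a proof, but it does confirm the sign: the mean square error is $\operatorname{Var}Y-\operatorname{Var}Z$, not the other way around.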