In the book *Statistical Inference* by Casella and Berger, in the proof of Theorem 4.4.7 (conditional variance identity), the author shows that one expectation equals $0$ (the step shown in the green rectangle).
However, when I followed the same idea I got a different result. I want to know what's wrong with my proof (below).
Thank you!
My derivation
Consider \begin{align*} E_X\big( \big[ X-E[X|Y] \big] \big[ E[X|Y]-EX \big] \big) \end{align*}
Note that the expectation is taken under the distribution of $X$. Also note that $E[X|Y] = g(Y)$ and that $EX$ is a constant. So we have \begin{align*} E_X\big( \big[ X-E[X|Y] \big] \big[ E[X|Y]-EX \big] \big) &= \big[ E[X|Y]-EX \big] E_X \big[ X-E[X|Y] \big] \\ &= \big[ E[X|Y]-EX \big] \big[ EX-E[X|Y] \big] \\ &= -\big[ E[X|Y]-EX \big]^2 \end{align*}
I took $\big[ E[X|Y]-EX \big]$ out because it's a function of $Y$:
$$E_X[h(X) g(Y)] = \int_x h(x)\, g(Y)\, f_X(x)\, dx = g(Y) \int_x h(x)\, f_X(x)\, dx = g(Y)\, E_X[h(X)]$$
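To see that something must be wrong, here is a quick numerical check on a tiny discrete model I made up (not from the book): $Y$ uniform on $\{0,1\}$, $X|Y=0$ uniform on $\{0,2\}$, $X|Y=1$ uniform on $\{2,4\}$. Computed under the joint distribution of $(X, Y)$, the cross term is $0$, while my derivation would predict a strictly negative number:

```python
# Hypothetical toy model: Y uniform on {0, 1};
# X | Y=0 uniform on {0, 2}; X | Y=1 uniform on {2, 4}.
joint = {(x, y): 0.25 for y, xs in [(0, [0, 2]), (1, [2, 4])] for x in xs}

g = {0: 1.0, 1: 3.0}  # g(y) = E[X | Y=y]
EX = sum(x * p for (x, y), p in joint.items())  # = 2.0

# The cross term, taken under the JOINT distribution of (X, Y):
cross = sum((x - g[y]) * (g[y] - EX) * p for (x, y), p in joint.items())

# What my derivation would predict: -E[(g(Y) - EX)^2]
predicted = -sum((g[y] - EX) ** 2 * p for (x, y), p in joint.items())

print(cross, predicted)  # 0.0 vs -1.0, so the derivation must be wrong somewhere
```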

Here is what I now think is going on.
Step 1
For $\text{Var}\, X$, the author writes:
\begin{align*} \text{Var} X = E \big( \big[ X - EX \big]^2 \big) = E \big( \big[ X - E(X|Y) + E(X|Y) - EX \big]^2 \big) \end{align*}
The first $E$ is $E_X$. The second $E$ is not $E_X$, but $E_{X, Y}$.
Step 2
When we say $EX$, we mean $E_X X$. However, when we say $E[X-E[X|Y]]$, we actually mean $E_{X, Y}[X-E[X|Y]]$.
To see this, note that $E[X|Y]$ is $g(Y)$, so $X-g(Y)$ is just another random variable, say $W$. Taking the expectation of $W$ should give a number, not a function of $Y$; that is, there should not be any randomness left.
Think about it this way. At first you somehow know the distribution of $X$, so you can calculate $EX$. Later you realize that $X$ actually depends on $Y$, and you have a new model/understanding of $X$: when $Y=y_1$, $X$ has one distribution; when $Y=y_2$, $X$ has another distribution; and so on. You can still calculate $EX$, but this time $EX = E_Y E_{X|Y}[X|Y]$.
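The iterated-expectation formula $EX = E_Y E_{X|Y}[X|Y]$ can be checked exactly on a small discrete model of my own (not the book's example), computing $EX$ once from the joint pmf and once by averaging the conditional means:

```python
# Hypothetical discrete model: Y ∈ {0, 1} with P(Y=0) = P(Y=1) = 1/2;
# X | Y=0 uniform on {0, 2}; X | Y=1 uniform on {2, 4}.
joint = {}  # joint pmf over (x, y)
for y, xs in [(0, [0, 2]), (1, [2, 4])]:
    for x in xs:
        joint[(x, y)] = 0.5 * 0.5  # P(Y=y) * P(X=x | Y=y)

# EX computed directly under the joint distribution
EX_joint = sum(x * p for (x, y), p in joint.items())

def cond_mean(y):
    """E[X | Y=y], computed from the joint pmf."""
    py = sum(p for (x, yy), p in joint.items() if yy == y)
    return sum(x * p for (x, yy), p in joint.items() if yy == y) / py

# EX via iterated expectation: E_Y[ E[X|Y] ]
EX_iterated = sum(0.5 * cond_mean(y) for y in (0, 1))

print(EX_joint, EX_iterated)  # both 2.0
```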
Maybe the following notation is clearer:
\begin{align*} E[X-E[X|Y]] &= E[X-g(Y)] \\ &= E_{X, Y}[X-g(Y)] \\ &= \int_x \int_y (x-g(y)) f_{X, Y}(x, y) dy dx \\ \end{align*}
You can see from the above that we really have to integrate against the joint distribution.
Now,
\begin{align*} E_{X, Y}[X-g(Y)] &= E_Y E_{(X, Y)|Y}[X-g(Y) | Y] \\ &= E_Y E_{X|Y}[X-g(Y) | Y] \\ &= E_Y E_{X|Y}[X-E[X|Y] | Y] \\ &= E_Y \big[ E_{X|Y}[X|Y]- E[X|Y] \big] \\ &= 0 \end{align*}
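With the cross term gone, Theorem 4.4.7 gives $\text{Var}\, X = E[\text{Var}(X|Y)] + \text{Var}(E[X|Y])$, which can be verified exactly on the same kind of toy model (again my own hypothetical example, not the book's):

```python
# Hypothetical toy model: Y uniform on {0, 1};
# X | Y=0 uniform on {0, 2}; X | Y=1 uniform on {2, 4}.
joint = {(x, y): 0.25 for y, xs in [(0, [0, 2]), (1, [2, 4])] for x in xs}

EX = sum(x * p for (x, y), p in joint.items())
VarX = sum((x - EX) ** 2 * p for (x, y), p in joint.items())

def cond(y):
    """Return (E[X|Y=y], Var(X|Y=y)) computed from the joint pmf."""
    py = sum(p for (x, yy), p in joint.items() if yy == y)
    m = sum(x * p for (x, yy), p in joint.items() if yy == y) / py
    v = sum((x - m) ** 2 * p for (x, yy), p in joint.items() if yy == y) / py
    return m, v

E_condvar = sum(0.5 * cond(y)[1] for y in (0, 1))      # E[Var(X|Y)]
Var_condmean = sum(0.5 * (cond(y)[0] - EX) ** 2 for y in (0, 1))  # Var(E[X|Y])

print(VarX, E_condvar + Var_condmean)  # 2.0 and 2.0
```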