Consider random variables $X$ and $Y$ with $\mathbb{E}[X^2] < \infty$. Show that if $g : \mathbb{R} \to \mathbb{R}$ is the function that minimizes $ \mathbb{E}[(X -g(Y))^2]$, then $g(Y) = \mathbb{E}[X\mid Y]$.
My approach: since $\mathbb{E}[X^2] < \infty$, we also have $\mathbb{E}[|X|] < \infty$. Because $g$ is the minimizer, $\mathbb{E}[(X-g(Y))^2] \leq \mathbb{E}[(X-f(Y))^2]$ for every function $f$. I would then like to somehow use the fact that
if $\mathbb{E}[|X|] < \infty$ and $\mathbb{E}[|f(Y)X|] < \infty$, then $\mathbb{E}[f(Y)X \mid Y] = f(Y)\mathbb{E}[X \mid Y]$.
I am not sure how to tie all this together and formulate a cohesive argument.
Observe that $$ \begin{align} [X - g(Y)]^2 &= [X-\mathbb{E}[X \mid Y] + \mathbb{E}[X \mid Y] - g(Y)]^2 \\ &= \left\{X - \mathbb{E}[X \mid Y]\right\}^2 + 2\left\{X - \mathbb{E}[X \mid Y]\right\}\{\mathbb{E}[X \mid Y] - g(Y)\} \\ &\qquad + \left\{\mathbb{E}[X \mid Y] - g(Y)\right\}^2\text{.} \end{align}$$

Now take expectations and use linearity. We may assume $\mathbb{E}[g(Y)^2] < \infty$, since otherwise $\mathbb{E}[(X - g(Y))^2] = \infty$ and $g$ cannot be a minimizer (the choice $g \equiv 0$ already gives the finite value $\mathbb{E}[X^2]$). Together with $\mathbb{E}[X^2] < \infty$, this makes every term below integrable.

We focus on the cross term first: $$\mathbb{E}\left\{2\left\{X - \mathbb{E}[X \mid Y]\right\}\{\mathbb{E}[X \mid Y] - g(Y)\}\right\}\tag{*}$$ By the tower property (law of iterated expectations), write the above as $$\mathbb{E}\left\{\mathbb{E}\left\{2\left\{X - \mathbb{E}[X \mid Y]\right\}\{\mathbb{E}[X \mid Y] - g(Y)\}\mid Y\right\}\right\}$$ The factor $$\{\mathbb{E}[X \mid Y] - g(Y)\}$$ is a function of $Y$ only, so it can be pulled out of the inner conditional expectation along with the constant $2$, yielding $$\mathbb{E}\left\{2\{\mathbb{E}[X \mid Y] - g(Y)\}\mathbb{E}\left\{X - \mathbb{E}[X \mid Y]\mid Y\right\}\right\}$$ Furthermore, $$\mathbb{E}\left\{X - \mathbb{E}[X \mid Y]\mid Y\right\} = \mathbb{E}[X \mid Y] - \mathbb{E}[\mathbb{E}[X \mid Y] \mid Y] = \mathbb{E}[X \mid Y] - \mathbb{E}[X \mid Y] = 0$$ so (*) equals $0$.

Hence $$\mathbb{E}\left\{[X - g(Y)]^2\right\} = \mathbb{E}\left\{\left[X - \mathbb{E}[X \mid Y]\right]^2\right\}+ \mathbb{E}\left\{\left[\mathbb{E}[X \mid Y] - g(Y)\right]^2\right\}\tag{**}$$ The first term on the right-hand side of (**) does not depend on $g$, so it plays no role in the minimization. The second term does depend on $g$, and it is non-negative because it is the expectation of a squared quantity. Thus $$\mathbb{E}\left\{\left[\mathbb{E}[X \mid Y] - g(Y)\right]^2\right\}$$ is minimized when $$\mathbb{E}\left\{\left[\mathbb{E}[X \mid Y] - g(Y)\right]^2\right\} = 0$$ which, since the integrand is non-negative, happens exactly when $$\left\{\mathbb{E}[X \mid Y] - g(Y)\right\}^2 = 0 \quad \text{almost surely,}$$ that is, $$g(Y) = \mathbb{E}[X \mid Y] \quad \text{almost surely.}$$
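
As a numerical sanity check of the decomposition (**), here is a minimal NumPy sketch. It is an illustration rather than part of the proof, and it rests on assumptions not in the original post: $Y$ takes finitely many values, the array `y` is one-dimensional, and $\mathbb{E}[X \mid Y]$ is replaced by the sample mean of `x` within each group of equal `y` values. The names `mse_decomposition` and `g_linear` are hypothetical. With this empirical conditional mean the cross term vanishes within each group, so the two sides of (**) should agree up to floating-point rounding.

```python
import numpy as np

def mse_decomposition(x, y, g):
    """Return (total, noise, gap) for the empirical version of (**).

    total = mean of (x - g(y))^2
    noise = mean of (x - E_hat[x | y])^2   (does not depend on g)
    gap   = mean of (E_hat[x | y] - g(y))^2
    Assumes y is a 1-D array with finitely many distinct values.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    # Empirical E[X | Y]: average x within each group of equal y values.
    _, inverse = np.unique(y, return_inverse=True)
    group_means = np.bincount(inverse, weights=x) / np.bincount(inverse)
    cond_mean = group_means[inverse]
    g_y = g(y)
    total = np.mean((x - g_y) ** 2)
    noise = np.mean((x - cond_mean) ** 2)
    gap = np.mean((cond_mean - g_y) ** 2)
    return total, noise, gap

# Simulated data (an assumption for illustration): X = Y^2 + standard normal noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 4, size=10_000)
x = y ** 2 + rng.normal(size=y.size)

def g_linear(v):
    # An arbitrary competitor to the conditional mean.
    return 3.0 * v - 1.0

total, noise, gap = mse_decomposition(x, y, g_linear)
print(total, noise + gap)  # the two numbers should agree up to rounding
```

Because `gap` is a mean of squares, it is never negative, so no choice of `g` can push `total` below `noise`; the bound is attained when `g` returns the group means themselves.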