Expectation of function of two random variables conditioned on one r.v. of the function itself


I am trying to understand the following.

We have two jointly distributed discrete random variables $X$ and $Y$. We are trying to use $X$ to predict $Y$. Specifically, let us choose some function $h(X)$ to predict $Y$, such that $h(X)$ is optimal. That is, $h(X)$ minimizes the $\text{MSE}=E\{[Y-h(X)]^2\}$.

Looking at this expectation, I believe that, expressed as a summation, it would be

$$\sum_{x,y} p_{X,Y}(x,y)(y-h(x))^2,$$

where $p_{X,Y}$ is the joint pmf of $X$ and $Y$.
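For concreteness, here is a small numerical sketch of this sum. The joint pmf and the predictor below are made up for illustration; they are not from the question:

```python
import numpy as np

# Hypothetical joint pmf of X in {0, 1} (rows) and Y in {0, 1, 2} (columns);
# the probabilities are assumptions chosen only so they sum to 1.
p_xy = np.array([[0.10, 0.20, 0.10],
                 [0.25, 0.25, 0.10]])
assert np.isclose(p_xy.sum(), 1.0)

xs = [0, 1]
ys = [0, 1, 2]

def mse(h):
    """Compute sum_{x,y} p_{X,Y}(x,y) * (y - h(x))^2 for a predictor h."""
    return sum(p_xy[i, j] * (y - h(x)) ** 2
               for i, x in enumerate(xs)
               for j, y in enumerate(ys))

# MSE of an arbitrary constant predictor h(x) = 1.
print(mse(lambda x: 1.0))
```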

But here is the next step: by the law of total expectation, we have

$$E\{[Y-h(X)]^2\}=E(E\{[Y-h(X)]^2\mid X\}),$$ where the outer expectation is taken with respect to $X$. Realizing that the inner expectation is minimized by setting $h(x)$ equal to $E(Y\mid X=x)$, we see that this choice minimizes the $\text{MSE}$.

My question is, would it be true that $E(E\{[Y-h(X)]^2\mid X\})$ is equal to the following double sum?

$$\sum_x \sum_y p_{Y\mid X}(y\mid x)p_X(x)(y-h(x))^2.$$

I am trying to expand out the expectation to see mathematically what is going on here. I understand the intuition behind the idea of having $h(x)$ equal to $E(Y\mid X=x)$.
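As a sanity check on that intuition, the sketch below (reusing a made-up joint pmf; all numbers are assumptions) computes $h(x)=E(Y\mid X=x)$ from the conditional pmf and confirms that no alternative predictor achieves a smaller MSE:

```python
import numpy as np

# Hypothetical joint pmf of X in {0, 1} (rows) and Y in {0, 1, 2} (columns).
p_xy = np.array([[0.10, 0.20, 0.10],
                 [0.25, 0.25, 0.10]])
ys = np.array([0.0, 1.0, 2.0])

p_x = p_xy.sum(axis=1)                 # marginal p_X(x)
p_y_given_x = p_xy / p_x[:, None]      # conditional p_{Y|X}(y|x)
h_opt = p_y_given_x @ ys               # h(x) = E[Y | X = x]

def mse(h_vals):
    """MSE = sum_x sum_y p(x,y) (y - h(x))^2, with h given as values over x."""
    return float(np.sum(p_xy * (ys[None, :] - h_vals[:, None]) ** 2))

# The conditional mean does at least as well as any other predictor tried here.
for h_alt in (np.array([0.0, 0.0]), np.array([1.0, 1.0]), h_opt + 0.3):
    assert mse(h_opt) <= mse(h_alt)

print(h_opt, mse(h_opt))
```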

Best answer:

Yes. Since $p_{X,Y}(x,y)=p_X(x)\,p_{Y\mid X}(y\mid x)$, you can directly get $$\sum_x \sum_y p_{X,Y}(x,y) (y-h(x))^2 = \sum_x p_X(x) \sum_y p_{Y \mid X}(y \mid x) (y - h(x))^2,$$ where the inner sum is $E\{[Y-h(X)]^2\mid X=x\}$ and the outer sum takes the expectation with respect to $X$.
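The identity can also be checked numerically. In this sketch the joint pmf and the predictor values are made up; the left side sums against the joint pmf, the right side against the factored marginal and conditional pmfs:

```python
import numpy as np

# Hypothetical joint pmf over x in {0, 1} (rows) and y in {0, 1, 2} (columns).
p_xy = np.array([[0.10, 0.20, 0.10],
                 [0.25, 0.25, 0.10]])
ys = np.array([0.0, 1.0, 2.0])
h = np.array([0.5, 1.5])               # arbitrary predictor values h(0), h(1)

# Left side: sum_x sum_y p_{X,Y}(x,y) (y - h(x))^2.
lhs = np.sum(p_xy * (ys[None, :] - h[:, None]) ** 2)

# Right side: sum_x p_X(x) sum_y p_{Y|X}(y|x) (y - h(x))^2.
p_x = p_xy.sum(axis=1)
p_y_given_x = p_xy / p_x[:, None]
rhs = np.sum(p_x[:, None] * p_y_given_x * (ys[None, :] - h[:, None]) ** 2)

assert np.isclose(lhs, rhs)
print(lhs, rhs)
```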