Specific question about the proof of the law of total expectation


Can someone please explain just the yellow part to me? Why is the first step equal to the second?

Proof:

[Image: the proof, with the step in question highlighted in yellow]


There is 1 best solution below


I'm not sure if this will help with what is confusing, Amit, but I think it might.

To get from $E[E[X|Y]]$ to $E\left[\sum_x x \, P(X=x|Y)\right]$ we just replace the inner expectation with the definition of conditional expectation applied to $E[X|Y]$.

A conditional expectation is actually a random variable, though. In this case, it is a function that maps each possible value $y$ of $Y$ to a simple (non-conditional) expectation, $E[X|Y=y]$. Think of $E[X|Y=y]$ as the expectation of $X$ restricted to the part of the probability space where $Y=y$: what is the expected value of $X$ within this subset of outcomes, i.e. when we fix $Y$ at the value $y$? Each $y$ gives a different (simple) expectation, and because $Y$ is random, so is the resulting value. That's why a conditional expectation is a function of $Y$, and hence a random variable.
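To make the "function of $y$" idea concrete, here is a minimal Python sketch. The joint distribution `joint` and the helper names `marginal_y` and `cond_exp_x_given_y` are made up for illustration and are not part of the original proof.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X=x, Y=y), chosen only for illustration.
joint = {
    (1, 0): Fraction(1, 8),
    (2, 0): Fraction(3, 8),
    (1, 1): Fraction(1, 4),
    (3, 1): Fraction(1, 4),
}

def marginal_y(joint):
    """Return P(Y=y) for every y, by summing the joint pmf over x."""
    p = {}
    for (x, y), pr in joint.items():
        p[y] = p.get(y, Fraction(0)) + pr
    return p

def cond_exp_x_given_y(joint, y):
    """E[X | Y=y] = sum_x x * P(X=x | Y=y), with P(X=x | Y=y) = P(x, y) / P(y)."""
    p_y = marginal_y(joint)[y]
    return sum(x * pr / p_y for (x, yy), pr in joint.items() if yy == y)

# One ordinary expectation per value of y: this mapping is E[X|Y].
for y in sorted(marginal_y(joint)):
    print(f"E[X | Y={y}] =", cond_exp_x_given_y(joint, y))
```

Each value of $y$ yields its own number, so the mapping $y \mapsto E[X|Y=y]$ evaluated at the random $Y$ is itself random.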

To get to the next line,

$$\sum_y\left[\sum_x x P(X=x|Y=y)\right]P(Y=y)$$

we are simply taking the (non-conditional) expectation of what's inside the outer expectation operator's scope. That is, we are taking the expectation of a random variable that is a function of $Y$, namely $g(Y)$ with $g(y)=E[X|Y=y]=\sum_x x\,P(X=x|Y=y)$. For any such function of a discrete random variable, $E[g(Y)]=\sum_y g(y)\,P(Y=y)$, which is exactly the displayed sum. Notice that while $x$ is bound by the inner summation operator, $y$ is not bound inside the brackets. This notation highlights the role of $y$ in the conditional expectation, treated as a function of $y$.

The second line of the proof, displayed above, is precisely the application of the expectation operator to the random variable $E[X|Y]$, whose value when $Y=y$ is $E[X|Y=y]$.
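If it helps, the double sum can be checked numerically. The following is a small sketch under an assumed joint distribution (the dictionary `joint_pmf` is invented for this example); it computes the inner sum for each $y$, weights it by $P(Y=y)$, and compares the result with $E[X]$ computed directly.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X=x, Y=y); the numbers are illustrative only.
joint_pmf = {
    (1, 0): Fraction(1, 8),
    (2, 0): Fraction(3, 8),
    (1, 1): Fraction(1, 4),
    (3, 1): Fraction(1, 4),
}

# Marginal P(Y=y), obtained by summing the joint pmf over x.
p_y = {}
for (x, y), pr in joint_pmf.items():
    p_y[y] = p_y.get(y, Fraction(0)) + pr

# Inner sum for each fixed y: sum_x x * P(X=x | Y=y).
inner = {
    y: sum(x * pr / p_y[y] for (x, yy), pr in joint_pmf.items() if yy == y)
    for y in p_y
}

# Outer sum: sum_y [inner sum at y] * P(Y=y), i.e. E[E[X|Y]].
outer = sum(inner[y] * p_y[y] for y in p_y)

# Direct computation of E[X] from the joint pmf, for comparison.
e_x = sum(x * pr for (x, _), pr in joint_pmf.items())

print("E[E[X|Y]] =", outer)
print("E[X]      =", e_x)
```

The two values agree because $P(X=x|Y=y)\,P(Y=y)=P(X=x,Y=y)$, so the double sum collapses to $\sum_x x\,P(X=x)$.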