Understanding theorem on expected value of a function $E(h(X))$


The theorem states:

Let $X, Y$ be random variables defined on $(\Omega,\Im,P)$ and $h$ a real-valued function such that $h(X)$ is a random variable. If $E(h(X))$ exists, then: $E(h(X))=E(E(h(X)\mid Y)).$

I don't understand the beginning of the proof:

$$E(E(h(X)\mid Y)) = \sum\limits_y E(h(X)\mid Y=y)P(Y=y)$$

I don't see how to get from the definition of conditional expected value to this equation. I appreciate your help.


Best answer

$E[h(X)\mid Y]$ is a random variable: it has the form $$ E[h(X)\mid Y] = g(Y) $$ for some deterministic function $g$.

So the randomness of this random variable depends only on $Y$: $$ E[h(X)\mid Y](\omega) = g(Y(\omega)), $$ and hence its expected value is $$ E[E[h(X)\mid Y]] = E[g(Y)] = \sum_y P(Y = y)\, g(y). $$ The notation that may look strange simply means $$ E[h(X)\mid Y=y] = g(y), $$ and substituting this in gives $$ E[E[h(X)\mid Y]] = \sum_y P(Y = y)\, E[h(X)\mid Y=y]. $$
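To make the identity concrete, here is a small numerical sketch of the derivation above. The joint distribution and the function $h$ are hypothetical choices made purely for illustration; the code computes $E[h(X)]$ directly and via $\sum_y E[h(X)\mid Y=y]P(Y=y)$ and checks that the two agree.

```python
# Numerical check of E[h(X)] = E[E[h(X)|Y]] on a tiny discrete distribution.
# The joint pmf and h below are assumptions for illustration only.

# joint pmf P(X=x, Y=y)
joint = {
    (0, 0): 0.1, (0, 1): 0.2,
    (1, 0): 0.3, (1, 1): 0.4,
}

h = lambda x: x ** 2 + 1  # any real-valued function of X

# direct computation: E[h(X)] = sum over (x, y) of h(x) P(X=x, Y=y)
direct = sum(h(x) * p for (x, y), p in joint.items())

# marginal P(Y=y)
p_y = {}
for (x, y), p in joint.items():
    p_y[y] = p_y.get(y, 0.0) + p

# g(y) = E[h(X) | Y=y] = sum_x h(x) P(X=x, Y=y) / P(Y=y)
def g(y):
    return sum(h(x) * p for (x, yy), p in joint.items() if yy == y) / p_y[y]

# tower property: E[E[h(X)|Y]] = sum_y g(y) P(Y=y)
tower = sum(g(y) * p for y, p in p_y.items())

print(direct, tower)  # the two values agree
```

The agreement is exactly the cancellation in the proof: multiplying $g(y)$ by $P(Y=y)$ undoes the division by $P(Y=y)$ inside the conditional expectation, leaving the plain sum over all $(x, y)$.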