$E[E[X\mid Y]] = \sum_{y\mid P(Y=y)>0}E[X\mid Y=y] \cdot P(Y =y)$


I'm looking at the proof of

$$E[X] = E[E[X\mid Y]]$$

But I'm having trouble seeing why, if for example we take $X, Y$ discrete random variables, we have that

$$E[E[X\mid Y]]= \sum_{y\mid P(Y=y)>0}E[X\mid Y=y] \cdot P(Y =y)$$

I know that

$E[X\mid Y]$ can be defined as a random variable $E[X\mid Y=y](\omega)$ if $ Y(\omega) = y$

but from that I lose a bit of intuition.

Does it mean that, since $E[X\mid Y]$ is a random variable whose value is determined by $y \in \operatorname{Im}(Y)$, to get its expected value we simply sum the values $E[X\mid Y=y]$ weighted over the whole probability space of $Y$?

Basically, if someone has a good intuitive explanation I'd be so happy!

Also, it is written that

$$\sum_{y\mid P(Y=y)>0}E[X\mid Y=y]\cdot P(Y =y) = \sum_{y\mid P(Y=y)>0} \frac{E[X\cdot \mathbb{1}_{(Y=y)}]}{P(Y=y)} P(Y=y)$$

(where $\mathbb{1}$ is the indicator function)

$$= \sum_{y\mid P(Y=y)>0}E[X\cdot\mathbb{1}_{(Y=y)}] = E\Big[X\cdot\sum_{y\mid P(Y=y)>0}\mathbb{1}_{(Y=y)}\Big] = E[X]$$

and I'm also having trouble following that part.

Thank you all for everything!


There are 2 answers below.

BEST ANSWER

For the first equation:

$$E[E[X\mid Y]]= \sum_{y\mid P(Y=y)>0}E[X\mid Y=y] \cdot P(Y =y)$$

It helps to think of $E[X\mid Y]$ as a function $f(Y)$, which is itself a random variable. Then, as with any expected value, you iterate over all possible values of the random variable, multiplying each by its probability:

$$E[f(Y)]= \sum_{y \mid P(Y=y)>0} f(y) \cdot P(Y =y)$$
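This reading can be checked numerically. The sketch below uses a small, arbitrary joint pmf (the distribution and the names `joint`, `f` are made up for illustration) and verifies that summing $f(y)\cdot P(Y=y)$ recovers $E[X]$:

```python
from fractions import Fraction as F

# A hypothetical joint pmf p(x, y) for discrete X and Y (values chosen arbitrarily).
joint = {(1, 0): F(1, 4), (2, 0): F(1, 4), (1, 1): F(1, 8), (3, 1): F(3, 8)}

# Marginal pmf of Y.
p_Y = {}
for (x, y), p in joint.items():
    p_Y[y] = p_Y.get(y, 0) + p

def f(y):
    """E[X | Y = y], viewed as a plain function of y."""
    return sum(x * p for (x, yy), p in joint.items() if yy == y) / p_Y[y]

# E[f(Y)]: iterate over the possible values of Y, weighting by P(Y = y).
E_f_Y = sum(f(y) * p for y, p in p_Y.items())

# E[X] computed directly from the joint pmf.
E_X = sum(x * p for (x, _), p in joint.items())
assert E_f_Y == E_X
```

Using exact fractions avoids floating-point noise, so the two sides agree exactly.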

As for your doubt about the last equations, I will write the proof in another way; maybe that helps.

$$E[E[X\mid Y]] = \sum_{y \mid P(Y=y)>0} E[X\mid Y = y] \cdot P(Y = y) = \sum_{y \mid P(Y=y)>0} ( \sum_{x \in X} x P(X = x \mid Y = y))\cdot P(Y = y)$$

$$= \sum_{y\mid P(Y=y)>0} (\sum_{x \in X} x \dfrac{P(X = x; Y = y)}{P(Y = y)})\cdot P(Y = y) = \sum_{x \in X} x \sum_{y \mid P(Y=y)>0} P(X = x; Y = y)$$

$$ = \sum_{x \in X} x P(X = x) = E[X]$$
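The key step above is swapping the order of the two sums. This can also be checked numerically; here is a sketch with a hypothetical joint pmf (the numbers are arbitrary), computing both sides of the chain:

```python
from fractions import Fraction as F

# Hypothetical joint pmf p(x, y), in the spirit of the derivation above.
joint = {(0, 'a'): F(1, 6), (1, 'a'): F(1, 3), (1, 'b'): F(1, 6), (4, 'b'): F(1, 3)}

# Marginal pmf of Y.
p_Y = {}
for (x, y), p in joint.items():
    p_Y[y] = p_Y.get(y, 0) + p

# Left-hand side: sum over y of E[X | Y = y] * P(Y = y),
# expanding the inner sum over x with P(X = x | Y = y) = p(x, y) / P(Y = y).
lhs = sum(sum(x * (p / p_Y[y]) for (x, yy), p in joint.items() if yy == y) * p_Y[y]
          for y in p_Y)

# Right-hand side after swapping the sums: sum over x of x * P(X = x).
p_X = {}
for (x, y), p in joint.items():
    p_X[x] = p_X.get(x, 0) + p
rhs = sum(x * p for x, p in p_X.items())
assert lhs == rhs
```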


I know that $E[X\mid Y]$ can be defined as a random variable $E[X\mid Y=y](\omega)$ if $ Y(\omega) = y$

Not quite; it is the other way around. $\mathsf E[X\mid Y=y]$ is defined as the value of the random variable $\mathsf E[X\mid Y]$ at all outcomes $\omega$ where $Y(\omega)=y$: $$\forall \omega\in Y^{-1}(y): \mathsf E(X\mid Y)(\omega)=\mathsf E(X\mid Y=y)$$

So $$\begin{align}\mathsf E(\mathsf E(X\mid Y)) &= \sum_{\omega\in\Omega} \mathsf E(X\mid Y)(\omega)\cdot\mathsf P\{\omega\} &&\text{by definition} \\[1ex]&= \sum_{y\in Y(\Omega)}\sum_{\omega\in Y^{-1}(y)} \mathsf E(X\mid Y)(\omega)\cdot\mathsf P\{\omega\} &&\text{partitioning the series} \\[1ex] &= \sum_{y\in Y(\Omega)} \mathsf E(X\mid Y=y) \sum_{\omega\in Y^{-1}(y)}\mathsf P\{\omega\}&&\text{by definition of }\mathsf E(X\mid Y=y) \\[1ex] &=\sum_{y\in Y(\Omega)}\mathsf E(X\mid Y=y)\cdot\mathsf P\{\omega\in\Omega:Y(\omega)=y\} &&\text{by countable additivity} \\[1ex] &=\sum_{y\in Y(\Omega)}\mathsf E(X\mid Y=y)\cdot\mathsf P(Y=y) &&\text{abbreviation}\end{align}$$

Now, what kind of value is $\mathsf E(X\mid Y=y)$? Well, for any event $E$ with nonzero probability measure, we define $\mathsf E(X\mid E)=\mathsf E(X\mathbf 1_E)\div\mathsf P(E)$.

$$\begin{align}\mathsf E(\mathsf E(X\mid Y))&=\sum_{y:\mathsf P(Y=y)>0} \dfrac{\mathsf E(X\mathbf 1_{Y=y})\mathsf P(Y=y)}{\mathsf P(Y=y)}+\sum_{y:\mathsf P(Y=y)=0}0 \\[1ex] &= \sum_{y:\mathsf P(Y=y)>0} \mathsf E(X\mathbf 1_{Y=y}) \\[1ex] &= \sum_{y:\mathsf P(Y=y)>0}\sum_{\omega\in\Omega:Y(\omega)=y} X(\omega)\mathsf P\{\omega\} \\[1ex] &= \sum_{\omega\in\Omega}X(\omega)\mathsf P\{\omega\} \\[1ex] &= \mathsf E(X)\end{align}$$
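To make the indicator argument concrete, here is a sketch on an explicit finite outcome space; the space $\Omega$ and the values of $X$, $Y$ and $\mathsf P$ are hypothetical, chosen only for illustration:

```python
from fractions import Fraction as F

# An explicit finite outcome space with its probability measure.
Omega = {'w1': F(1, 4), 'w2': F(1, 4), 'w3': F(1, 2)}
X = {'w1': 1, 'w2': 3, 'w3': 5}
Y = {'w1': 0, 'w2': 0, 'w3': 1}

def E(Z):
    """Expectation of a random variable Z, summing over outcomes."""
    return sum(Z[w] * p for w, p in Omega.items())

def cond_exp_given(event):
    """E(X | E) = E(X * 1_E) / P(E), as defined in the answer above."""
    P_E = sum(Omega[w] for w in event)
    X_times_indicator = {w: (X[w] if w in event else 0) for w in Omega}
    return E(X_times_indicator) / P_E

# Sum E(X | Y = y) * P(Y = y) over the values of Y; since the indicators
# 1_{Y=y} over distinct y sum to 1, the total collapses to E(X).
total = 0
for y in set(Y.values()):
    event = {w for w in Omega if Y[w] == y}
    P_E = sum(Omega[w] for w in event)
    total += cond_exp_given(event) * P_E
assert total == E(X)
```

The loop retraces the derivation above: each term contributes $\mathsf E(X\mathbf 1_{Y=y})$, and those pieces partition the full sum $\sum_{\omega} X(\omega)\mathsf P\{\omega\}$.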