How can we explain the fact that $\mathbb E\left( {\mathbb E\left( {X|Y} \right)} \right) = \mathbb E(X)$?


We know that the following property of conditional expectation holds, assuming $\mathbb E[|X|]<\infty$:

$\mathbb E\left( {\mathbb E\left( {X|Y} \right)} \right) = \mathbb E(X)$

Could anyone give me some intuition for this? Why does taking the expected value again, after conditioning on the random variable $Y$, not affect the expected value of $X$?


I don't know if these diagrams will be of any use to you. I find it useful to think of conditioning as putting a transparency over the sample space.

Let $X_1, X_2$ be two fair, independent coin flips. Denote the outcome of heads by $0$ and the outcome of tails by $1$. Let $S = X_1 + X_2$. The sample space $\Omega$ has four points: $\{(0,0), (0,1), (1,0), (1,1)\} = \{\omega_1, \omega_2, \omega_3, \omega_4\}$:

[Diagram: the four sample points of $\Omega$]

First consider the inner conditional expectation, $Z = E[S | X_2]$. Note that $Z$ is a random variable: for each $\omega \in \Omega$, $Z(\omega)$ is a real number. It's simply that $Z(\omega)$ is constant on the sets $X_2^{-1}(\{0\}) = \{(0,0), (1,0)\} = \{\omega_1, \omega_3\}$ and $X_2^{-1}(\{1\}) = \{(0,1),(1,1)\} = \{\omega_2, \omega_4\}.$ In diagram format,

[Diagram: $\Omega$ partitioned into the two slices $\{X_2 = 0\}$ and $\{X_2 = 1\}$, on which $Z$ is constant]

What is the constant value of $Z$ when $X_2 = 0$? It is the conditional expectation of $S$ given $X_2 = 0$, which is an average of $S$ over the $\omega$'s in the set $\{\omega : X_2(\omega) = 0\}$: $$E[S | X_2](\omega_1) = E[S|X_2](\omega_3) = 0.5.$$ Similarly, $$E[S | X_2](\omega_2) = E[S|X_2](\omega_4) = 1.5.$$

Now what happens when you do $E[E[S|X_2]]$? You average again. The rule $E[E[S|X_2]] = E[S]$ can be (roughly) read as "the average of the partial averages is the full average".
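This "average of the partial averages" can be checked directly on the four-point sample space. A minimal sketch in Python (the dictionary representation of the random variables is my own choice, not part of the original answer):

```python
import itertools

# The two-coin example above: heads = 0, tails = 1.
omega = list(itertools.product([0, 1], repeat=2))   # [(0,0), (0,1), (1,0), (1,1)]
prob = {w: 0.25 for w in omega}                     # fair, independent flips

S = {w: w[0] + w[1] for w in omega}                 # S = X1 + X2

# Inner step: Z = E[S | X2] averages S over each slice {omega : X2(omega) = x2}
# and is constant on that slice.
Z = {}
for x2 in (0, 1):
    slice_ = [w for w in omega if w[1] == x2]
    p_slice = sum(prob[w] for w in slice_)
    avg = sum(S[w] * prob[w] for w in slice_) / p_slice
    for w in slice_:
        Z[w] = avg

# Outer step: the average of the partial averages equals the full average.
E_Z = sum(Z[w] * prob[w] for w in omega)
E_S = sum(S[w] * prob[w] for w in omega)
print(Z[(0, 0)], Z[(0, 1)])  # 0.5 on the X2 = 0 slice, 1.5 on the X2 = 1 slice
print(E_Z, E_S)              # both 1.0
```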


The intuition - meaning let's forget about rigor and probability spaces for a second, and just develop a mental cartoon picture of the concept - is actually quite straightforward:

Inside the LHS of the equation $\mathbb E\left( {\mathbb E\left( {X|Y} \right)} \right) = \mathbb E(X)$ we find $\color{blue}{\mathbb E\left( {X|Y} \right)},$ which conditions the expectation of the random variable $X$ on the value of the random variable $Y.$ As such, $\mathbb E\left( {X|Y} \right)$ is itself a random variable, and a function of $Y$: it is not a number, but can be written as $g(Y)$ for some measurable function $g:\mathbb R \to \mathbb R.$ As you slide across the range of values of $Y,$ the conditional expectation of $X$ changes, provided $X$ and $Y$ are dependent.

But this expression $\color{blue}{\mathbb E\left( {X|Y} \right)}$ is further enclosed within the operator $\mathbb E\left( \cdot \right),$ which means that we then take the expectation (weighted mean) over all values of $Y.$ In doing so we are essentially integrating $Y$ out, making its individual values irrelevant.
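In the discrete case (assuming $X$ and $Y$ take countably many values), this integrating-out step can be written explicitly:

$$\mathbb E\big(\mathbb E(X|Y)\big) = \sum_y \mathbb E(X|Y=y)\,\mathbb P(Y=y) = \sum_y \sum_x x\,\mathbb P(X=x|Y=y)\,\mathbb P(Y=y) = \sum_x x \sum_y \mathbb P(X=x,\,Y=y) = \sum_x x\,\mathbb P(X=x) = \mathbb E(X).$$

Swapping the order of summation is what makes the individual values of $Y$ disappear.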


Again, just an intuition!
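The same averaging can also be checked numerically for a dependent pair. A small Monte Carlo sketch; the particular distributions here are hypothetical, chosen only so that $\mathbb E(X|Y) = Y$:

```python
import random

random.seed(0)

# Hypothetical dependent pair: Y is uniform on {1, 2, 3}, and given
# Y = y, X is uniform on {0, ..., 2y}, so that E[X | Y = y] = y.
N = 200_000
samples = []
for _ in range(N):
    y = random.choice([1, 2, 3])
    x = random.choice(range(2 * y + 1))
    samples.append((x, y))

# Direct Monte Carlo estimate of E[X].
e_x = sum(x for x, _ in samples) / N

# Iterated estimate of E[E[X|Y]]: average X within each Y-slice,
# then weight each slice average by its empirical probability.
e_cond = 0.0
for y0 in (1, 2, 3):
    slice_x = [x for x, y in samples if y == y0]
    e_cond += (len(slice_x) / N) * (sum(slice_x) / len(slice_x))

print(e_x, e_cond)  # both close to E[Y] = 2
```

The two estimates agree exactly on any given sample, since averaging slice means weighted by slice frequencies just reassembles the overall mean.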