Why does iterated conditioning ignore later information?


Iterated conditioning (the tower property) states the following. Let $X$ be a random variable and $\mathcal{G}_1 \subseteq \mathcal{G}_2$ be two $\sigma$-algebras. Then,

$\mathbb{E}[\mathbb{E}[X\vert \mathcal{G}_2 ]\vert \mathcal{G}_1]=\mathbb{E}[X\vert \mathcal{G}_1]$

Since $X$ is conditioned on $\mathcal{G}_2$, shouldn't the value of $X$ be known by that time? So the equation should be,

$\mathbb{E}[X\vert \mathcal{G}_2 ] = \color{red}c$

where $c$ is some constant. If this isn't true, it seems like we throw away the information in $\mathcal{G}_2$ and only use the lesser information in $\mathcal{G}_1$. At the very least, I would have written it as (which I know is wrong),

$\mathbb{E}[\mathbb{E}[X\vert \mathcal{G}_2 ]\vert \mathcal{G}_1]=\mathbb{E}[X\vert \color{red}{\mathcal{G}_2}]$

Obviously, I don't have mathematical maturity, so maybe someone kind could enlighten me?

1 Answer

The bigger $\mathcal{G}$ is, the closer $\mathbb{E}[X \mid \mathcal{G}]$ is to $X$ (the actual random variable $X$). The smaller $\mathcal{G}$ is, the closer $\mathbb{E}[X \mid \mathcal{G}]$ is to $\mathbb{E}[X]$ (the "best constant approximation" to $X$).

For an example, say $X$ is the number of heads in two flips of an unbiased coin, $\mathcal{G}_2$ is the full $\sigma$-algebra $2^\Omega$ (generated by the outcomes of both flips), and $\mathcal{G}_1$ is generated by the outcome of the first flip alone. Then $\mathbb{E}[X \mid \mathcal{G}_2]$ is just $X$: if I know both flips, I can tell you exactly what $X$ was, whether it was $0$, $1$, or $2$. If you take the $\mathcal{G}_1$-conditional expectation of that, you get the "best guess" for $X$ given only the first flip. This still depends on what happened in the first flip: if the first flip was heads, it is $1.5$; if the first flip was tails, it is $0.5$.
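The coin example can be checked directly by computing conditional expectations as averages over the atoms of each $\sigma$-algebra (a minimal sketch; the helper `cond_exp` and the dictionary representation of random variables are illustrative choices, not standard library functions):

```python
from itertools import product

# Sample space: two fair coin flips; each of the 4 outcomes has probability 1/4.
omega = list(product("HT", repeat=2))
prob = {w: 0.25 for w in omega}

# X = number of heads in the two flips.
X = {w: w.count("H") for w in omega}

def cond_exp(f, atom_of):
    """Conditional expectation of f given the partition induced by atom_of:
    on each outcome w, average f over all outcomes in the same atom as w."""
    result = {}
    for w in omega:
        atom = [v for v in omega if atom_of(v) == atom_of(w)]
        total = sum(prob[v] for v in atom)
        result[w] = sum(prob[v] * f[v] for v in atom) / total
    return result

# G_2 is generated by both flips, so its atoms are singletons: E[X | G_2] = X.
E_X_G2 = cond_exp(X, atom_of=lambda w: w)

# Condition that on G_1 (generated by the first flip only).
iterated = cond_exp(E_X_G2, atom_of=lambda w: w[0])

# Direct E[X | G_1] for comparison; the tower property says they agree.
E_X_G1 = cond_exp(X, atom_of=lambda w: w[0])

print(iterated == E_X_G1)        # the two conditional expectations coincide
print(iterated[("H", "H")])      # 1.5 when the first flip is heads
print(iterated[("T", "T")])      # 0.5 when the first flip is tails
```

Note that `iterated` is genuinely a random variable (it takes the value $1.5$ or $0.5$ depending on the first flip), not a constant, which is the point the question was missing: $\mathbb{E}[X \mid \mathcal{G}_2]$ is only constant when $\mathcal{G}_2$ is trivial.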