How did they apply the tower property $E[X] = E[E[X\mid Y]]$?


I am trying to understand a proof in my book, but there is one detail that I don't get. Here it is:

Let $C_1,C_2,\dots,C_{j+k}$ be random variables. Then:

$E[C_{j+k} \mid C_1,\dots,C_{j}] = E\Big[ E[C_{j+k} \mid C_1,\dots,C_{j+k-1}] \;\Big|\; C_1,\dots,C_j \Big]$

I know they are using $E[X] = E[E[X\mid Y]]$ somehow, but I can't figure out how.


There is 1 best solution below


Perhaps the image below will clarify further why:

$$E[A|\color{blue}{B}] = E\left[\;E[A|C,\color{blue}{B}]\;|\color{blue}{B}\right]$$
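To connect this with the notation in the question (this mapping is my reading of the question, not something taken from the book, and it assumes $k \ge 2$ so that $C$ is non-empty), take $A = C_{j+k}$, $B = (C_1,\dots,C_j)$ and $C = (C_{j+1},\dots,C_{j+k-1})$. Conditioning on $(C,B)$ is then the same as conditioning on $C_1,\dots,C_{j+k-1}$, and the identity above becomes:

$$E[C_{j+k} \mid \color{blue}{C_1,\dots,C_j}] = E\left[\;E[C_{j+k} \mid C_{j+1},\dots,C_{j+k-1},\color{blue}{C_1,\dots,C_j}]\;\Big|\;\color{blue}{C_1,\dots,C_j}\right]$$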

Warning: the following is not a formal explanation, but rather an intuitive feel for the matter:

Let $A$ be represented by the square below. (I'm using Bernoulli variables for simplicity.) You already know: $$E[A] = E\left[ \;E[A|B] \;\right]$$ Or, in friendlier words: "Averaging over $A$ is like taking the average of the averages over the subsets $A|B=1$ and $A|B=0$, weighted by $P(B=1)$ and $P(B=0)$."

Now consider an extra random variable $C$.

Consider $E[A|B]$, which depends on the value of $B$, since $E[A|B=0]$ and $E[A|B=1]$ can give different answers (although in the picture the areas are similar).

For instance, $E[A|B=1]$ can be written (using the 'tower rule') as $E\left[ \; E[A|C,B=1]\;\right]$, where the outer expectation averages over $C$ within the event $B=1$. Or once again: "Averaging over $A|B=1$ is like taking the average of the averages over the subparts $A|C=0,B=1$ and $A|C=1,B=1$."
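Written out for a Bernoulli $C$, this average of averages is a weighted sum, where the weights are the conditional probabilities of $C$ given $B=1$ (a sketch for the discrete case only, in keeping with the informal spirit above):

$$E[A|B=1] = P(C=0|B=1)\,E[A|C=0,B=1] + P(C=1|B=1)\,E[A|C=1,B=1]$$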

Similarly, $E[A|B=0] = E\left[ \; E[A|C,B=0]\;\right]$.

But since the original question was about $E[A|B]$, you don't know in advance which of these two iterated expectations to use. That's why the outer expectation needs the extra condition on $B$: $$E[A|\color{blue}{B}] = E\left[\;E[A|C,\color{blue}{B}]\;|\color{blue}{B}\right]$$
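If you want to convince yourself numerically, here is a minimal Python sketch that checks the identity for three Bernoulli variables. The joint weights are made up purely for illustration, and the function names are my own. For each value $b$ it computes $E[A|B=b]$ directly and via the inner averages over $C$, using exact fractions so the two sides can be compared with `==`.

```python
from fractions import Fraction

# Hypothetical joint weights for (A, B, C); any strictly positive
# numbers would do. They are normalised into a pmf below.
weights = {
    (0, 0, 0): 1, (0, 0, 1): 2, (0, 1, 0): 3, (0, 1, 1): 1,
    (1, 0, 0): 2, (1, 0, 1): 1, (1, 1, 0): 1, (1, 1, 1): 5,
}
total = sum(weights.values())
pmf = {abc: Fraction(w, total) for abc, w in weights.items()}


def prob(pred):
    """Probability of the event described by pred(a, b, c)."""
    return sum(p for (a, b, c), p in pmf.items() if pred(a, b, c))


def exp_A_given_B(b0):
    """E[A | B = b0], computed directly from the joint pmf."""
    num = sum(a * p for (a, b, c), p in pmf.items() if b == b0)
    return num / prob(lambda a, b, c: b == b0)


def exp_A_given_CB(c0, b0):
    """E[A | C = c0, B = b0], the inner conditional expectation."""
    num = sum(a * p for (a, b, c), p in pmf.items() if b == b0 and c == c0)
    return num / prob(lambda a, b, c: b == b0 and c == c0)


def tower_side(b0):
    """E[ E[A | C, B] | B = b0 ]: average the inner values over C given B = b0."""
    p_b = prob(lambda a, b, c: b == b0)
    result = Fraction(0)
    for c0 in (0, 1):
        p_c_given_b = prob(lambda a, b, c: b == b0 and c == c0) / p_b
        result += p_c_given_b * exp_A_given_CB(c0, b0)
    return result


for b0 in (0, 1):
    print(b0, exp_A_given_B(b0), tower_side(b0))
```

Both columns printed for each `b0` agree, which is the statement $E[A|B] = E[E[A|C,B]|B]$ evaluated at $B=0$ and $B=1$.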
