Intuitive explanation of the tower property of conditional expectation


I understand how to define conditional expectation and how to prove that it exists.

Further, I think I understand what conditional expectation means intuitively. I can also prove the tower property, that is, if $X$ and $Y$ are random variables (or $Y$ is a $\sigma$-field), then we have

$$\mathbb E[X] = \mathbb{E}[\mathbb E [X | Y]].$$

My question is: What is the intuitive meaning of this? It seems quite puzzling to me.

(I could find similar questions but not this one.)


There are 3 best solutions below

3
On BEST ANSWER

For simple discrete situations, from which one obtains most basic intuitions, the meaning is clear.

I have a large bag of biased coins, all of which favour heads. Half of them land heads with probability $0.7$. Two-fifths of them land heads with probability $0.8$. And the rest (one-tenth of the bag) land heads with probability $0.9$.

Pick a coin at random, toss it, say once. To find the expected number of heads, calculate the expectations, given the various biasing possibilities. Then average the answers, taking into consideration the proportions of the various types of coin.

It is intuitively clear that this formal procedure "should" give about the same answer as the highly informal process of, say, repeating the experiment $1000$ times and dividing the total number of heads by $1000$. For if we do that, in about $500$ cases we will get the first type of coin, and out of these $500$ we will get about $350$ heads, and so on. The informal arithmetic mirrors exactly the more formal process described in the preceding paragraph.

If it is more persuasive, we can imagine tossing the chosen coin $12$ times.
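The averaging described above can be written out as a short computation. This is only a sketch of the bag-of-coins example: the names `coin_types` and `expected_heads` are made up for illustration, and the one-tenth share for the third coin type is what is left after the half and the two-fifths.

```python
from fractions import Fraction

# (share of the bag, probability of heads) for each coin type in the example.
# The last share, 1/10, is "the rest": 1 - 1/2 - 2/5.
coin_types = [
    (Fraction(1, 2), Fraction(7, 10)),
    (Fraction(2, 5), Fraction(8, 10)),
    (Fraction(1, 10), Fraction(9, 10)),
]

def expected_heads(tosses):
    # Inner step: E[heads | coin type] = tosses * p for that type.
    # Outer step: average those conditional expectations over the coin types.
    return sum(share * tosses * p for share, p in coin_types)

one_toss = expected_heads(1)
twelve_tosses = expected_heads(12)
print(one_toss, twelve_tosses)  # 19/25 228/25
```

So a single toss gives $0.76$ heads on average, and twelve tosses give $9.12$, which is what the "repeat it $1000$ times" bookkeeping would approximate.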

6
On

First, recall that in $E[X|Y]$ we are taking the expectation with respect to $X$, and so it can be written as $E[X|Y]=E_X[X|Y]=g(Y)$. Because it's a function of $Y$, it's a random variable, and hence we can take its expectation (with respect to $Y$ now). So the double expectation should be read as $E_Y[E_X[X|Y]]$.

About the intuitive meaning, there are several approaches. I like to think of the expectation as a kind of predictor/guess (indeed, it's the predictor that minimizes the mean squared error).

Suppose, for example, that $X, Y$ are two (positively) correlated variables, say the weight and height of persons from a given population. The expectation of the weight, $E(X)$, would be my best guess of the weight of an unknown person: I'd bet on this value if not given more data (my uninformed bet is constant). If instead I know the height, I'd bet on $E(X | Y)$: that means that for different persons I'd bet a different value, and my informed bet would not be constant. Sometimes I'd bet more than the "uninformed bet" $E(X)$ (for tall persons), sometimes less. The natural question arises: can I say something about my informed bet on average? Well, the tower property answers: on average, you'll bet the same.
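A small simulation can make the two bets concrete. Everything here is an assumption made for illustration: the population is synthetic, the numbers (mean height 170 cm, mean weight 70 kg, the noise levels) are invented, and the "informed bet" conditions on a 5 cm height band rather than on the exact height, so it is the empirical $E(X \mid \text{band})$.

```python
import random

random.seed(0)  # arbitrary seed, only so the run is reproducible

# Hypothetical population: (height in cm, weight in kg), positively correlated.
people = []
for _ in range(10_000):
    height = random.gauss(170, 10)
    weight = 70 + 0.9 * (height - 170) + random.gauss(0, 8)
    people.append((height, weight))

def band(height):
    # Index of the 5 cm height band this person falls into.
    return int(height // 5)

# Uninformed bet: the overall mean weight (empirical E[X]).
uninformed_bet = sum(w for _, w in people) / len(people)

# Informed bet: the mean weight within each height band (empirical E[X | band]).
weights_by_band = {}
for h, w in people:
    weights_by_band.setdefault(band(h), []).append(w)
informed_bet = {b: sum(ws) / len(ws) for b, ws in weights_by_band.items()}

# Average the informed bet over the whole population.
average_informed_bet = sum(informed_bet[band(h)] for h, _ in people) / len(people)

print(uninformed_bet, average_informed_bet)
```

The informed bet differs from band to band, yet its population average matches the uninformed bet up to floating-point rounding, because averaging the band means weighted by band size just recomputes the overall mean.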


Added: I agree (ten years later) with @Did's comment below. My notation here is misleading; an expectation is defined in itself, and it makes little or no sense to specify "with respect to $Y$". In my answer here I try to clarify this, and to reconcile this fact with the (many) examples where one qualifies the expectation with a subscript ("with respect to ...").

0
On

The expected value of $X$ is still the expected value of $X$ when you take into account the possible values of $Y$.