Iterated conditional probability notation


I'm currently self-studying the third edition of Andrew Gelman's book "Bayesian Data Analysis". On page 41, the authors write:

$E(\tilde{y}|y)=E(E(\tilde{y}|\theta,y)|y)$

I am OK with multiple conditions, and usually OK with the math involved in the book, but this notation with a nested condition confuses me. I tried to find a definition of nested conditions in past personal notes/books, but could not find one. Online, I saw something similar in the "Tower property" section at https://en.wikipedia.org/wiki/Conditional_expectation#Basic_properties, but that page uses notation and concepts that the book does not use and that are a bit abstract.

I have a feeling that this is the definition I'm looking for:

$Pr[(A|B)|C]:=Pr[A|B,C]$ for events, or $E(E(x|z,y)|y):=E(E(x|z,y))=E(x|y)$ for iterated expectation of random variables.

Does someone have an online reference that defines nested conditions and confirms (or refutes) my guessed definition? If I'm wrong, what would be the meaning of $Pr[(A|B)|C]$ and $E(E(x|z,y)|y)$ (if any)?

Thank you for your help!

3 Answers

BEST ANSWER

With the help of @Mason and @William M., and after taking the time to review the law of total expectation in more depth, I found the source of my confusion: I was applying the law of total expectation incorrectly.

The law of total expectation says $E[U]=E(E[U|Z])$, but it also requires that $U$ and $Z$ be defined on the same probability space. My mistake was to take $U=(X|Y=y)$ and add the condition on $Z$, ending up with $E[E(X|Z,Y=y)]$, and then I did not understand the necessity of the extra $|Y=y$. I was adding a condition while ignoring that the random variables were living in a restricted probability space. This was my mistake.

$E[X|Y=y]=E[E(X|Z,Y=y)|Y=y]$ really is a direct consequence of $E[U]=E(E[U|Z])$, but with the restriction that everything happens in the probability space defined by the condition $Y=y$. We add the condition on $Z$, but we remain in the probability space restricted by $Y=y$. This is the meaning of the extra $|Y=y$ that I had omitted in my first attempt.

Note that $E[E(X|Z,Y=y)|Y=y]\ne E[E(X|Z,Y=y)]$ in general, and the Bayesian statistics book provides a good example. In my original problem, $E[E(\tilde{Y}|\theta ,Y)|Y]=E[\theta|Y]=\mu_1$, the posterior mean, whereas $E[E(\tilde{Y}|\theta ,Y)]=E[\theta]=\mu_0$, the prior mean.
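The gap between the prior mean $\mu_0$ and the posterior mean $\mu_1$ can be checked numerically. Below is a minimal Monte Carlo sketch (not from the book) using a normal-normal model with hypothetical values $\mu_0=2$, $\tau=\sigma=1$, $y=5$; conditioning on $Y=y$ is approximated by keeping draws whose simulated $Y$ falls in a narrow window around $y$.

```python
import numpy as np

rng = np.random.default_rng(0)

mu0, tau = 2.0, 1.0      # prior: theta ~ N(mu0, tau^2)  (hypothetical values)
sigma = 1.0              # likelihood: Y | theta ~ N(theta, sigma^2)
y_obs = 5.0              # hypothetical observed value

n = 2_000_000
theta = rng.normal(mu0, tau, n)
y = rng.normal(theta, sigma)

# E[E(Ytilde | theta, Y)] = E[theta] = mu0, the prior mean
prior_mean_mc = theta.mean()

# E[E(Ytilde | theta, Y) | Y=y] = E[theta | Y=y] = mu1, the posterior mean,
# approximated by a rejection window around the observed y
keep = np.abs(y - y_obs) < 0.05
post_mean_mc = theta[keep].mean()

# closed-form posterior mean for the normal-normal model
mu1 = (mu0 / tau**2 + y_obs / sigma**2) / (1 / tau**2 + 1 / sigma**2)

print(prior_mean_mc, post_mean_mc, mu1)
```

With these values $\mu_1 = 3.5 \ne \mu_0 = 2$, and the two Monte Carlo estimates land near those distinct targets.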

I no longer think my digression on expectations written with subscripts is necessary, so I removed it.

I hope this answer helps anyone who stumbles over the same difficulty!

PS: I think this also clarifies the question about $(A|B)|C$.

ANSWER

FOR THE SAKE OF EXPLANATION: I will denote by $t$ the sample points of the variate $\theta$, by $\tilde{\omega}$ those of $\tilde{y}$, and by $\omega$ those of $y$. The random vector $(\theta, \tilde{y}, y)$ is assumed to have a density $f_{\theta, \tilde{y}, y}(t, \tilde{\omega}, \omega)$, and we want to show that $$ E(E(\tilde{y} \mid \theta, y) \mid y) = \int dt\ \left[ \int d\tilde{\omega}\ \tilde{\omega} f_{\tilde{y} \mid \theta, y}(\tilde{\omega} \mid t, y) \right] f_{\theta \mid y}(t \mid y) $$ coincides with $$ E(\tilde{y} \mid y) = \int d\tilde{\omega}\ \tilde{\omega} f_{\cdot, \tilde{y} \mid y}(\tilde{\omega} \mid y), $$ where $\cdot$ means that the variable was integrated out and renormalised. By definition, $$ f_{\cdot, \tilde{y} \mid y}(\tilde{\omega} \mid y) = \dfrac{\int dt\ f_{\theta, \tilde{y}, y}(t, \tilde{\omega}, y)}{\int d(t, \tilde{\omega})\ f_{\theta, \tilde{y}, y}(t, \tilde{\omega}, y)}, $$ and $$ f_{\tilde{y} \mid \theta, y}(\tilde{\omega} \mid t, y)\, f_{\theta \mid y}(t \mid y) = \dfrac{f_{\theta, \tilde{y}, y}(t, \tilde{\omega}, y)}{\int d\tilde{\omega}\ f_{\theta, \tilde{y}, y}(t, \tilde{\omega}, y)} \cdot \dfrac{\int d\tilde{\omega}\ f_{\theta, \tilde{y}, y}(t, \tilde{\omega}, y)}{\int d(t, \tilde{\omega})\ f_{\theta, \tilde{y}, y}(t, \tilde{\omega}, y)}. $$ The middle factors cancel, so multiplying by $\tilde{\omega}$ and integrating yields the same expression in both cases. By the way, we must assume that $|\theta| + |\tilde{y}| + |y|$ is an integrable random variable. (The same result applies to random vectors, provided their $\mathbf{L}^1$ norms are integrable.)
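The cancellation in this answer can be verified numerically on a discrete stand-in for the joint density $f_{\theta, \tilde{y}, y}$: integrals become sums over a 3-axis array of probabilities. The array shape and random values below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# discrete stand-in for the joint density f_{theta, ytilde, y}:
# axis 0 = theta values t, axis 1 = ytilde values w~, axis 2 = y values w
p = rng.random((4, 5, 3))
p /= p.sum()

w_vals = np.arange(5.0)   # support of ytilde

y_idx = 2                 # condition on one value of y
slice_y = p[:, :, y_idx]  # f(theta, ytilde, y) at that y

# direct form: f_{ytilde | y} = (theta integrated out) / (normalising constant)
f_yt_given_y = slice_y.sum(axis=0) / slice_y.sum()
lhs = (w_vals * f_yt_given_y).sum()          # E(ytilde | y)

# nested form: E( E(ytilde | theta, y) | y )
f_theta_given_y = slice_y.sum(axis=1) / slice_y.sum()
inner = (slice_y * w_vals).sum(axis=1) / slice_y.sum(axis=1)  # E(ytilde | theta, y)
rhs = (inner * f_theta_given_y).sum()

print(lhs, rhs)
```

The two quantities agree to floating-point precision, mirroring the cancellation of the middle factors in the density identity above.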

ANSWER

$P((A \mid B) \mid C)$ does not make sense. But $E(E(X \mid Z, Y) \mid Y)$ makes sense. To parse it, first note that $E(X \mid Z, Y)$ is a random variable which is a function of $Z$ and $Y$. So we can write $E(X \mid Z, Y) = f(Z, Y)$. Now $f(Z, Y)$ is a random variable and $E(f(Z, Y) \mid Y)$ makes sense.

The equality $E(E(X \mid Z, Y) \mid Y) = E(X \mid Y)$ is a consequence of the abstract tower property. Some form of the tower property can be proven using conditional densities. You can prove this specific identity like this: \begin{align} E(X \mid Y = y) &= \int x f(x \mid y)\,dx \\ &= \int x \int f(z \mid y)f(x \mid y, z)\,dz\,dx \\ &= \int \int x f(x \mid y, z)\,dx f(z \mid y)\,dz \\ &= \int E(X \mid Y = y, Z = z) f(z \mid y)\,dz \\ &= E(E(X \mid Y = y, Z) \mid Y = y) \\ &= E(E(X \mid Y, Z) \mid Y = y). \end{align}
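The derivation above can also be checked mechanically on a small discrete joint distribution for $(X, Y, Z)$, computing both sides of the identity for every value of $y$. The support values and probabilities below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# discrete joint p(x, y, z); axes 0, 1, 2 index the supports of X, Y, Z
p = rng.random((3, 4, 5))
p /= p.sum()
x_vals = np.array([0.0, 1.0, 2.5])   # arbitrary support of X

for yi in range(4):
    sl = p[:, yi, :]                               # p(x, Y=yi, z)
    # direct: E(X | Y=yi)
    direct = (x_vals @ sl.sum(axis=1)) / sl.sum()
    # nested: sum_z E(X | Y=yi, Z=z) * p(z | Y=yi)
    e_x_given_yz = (x_vals @ sl) / sl.sum(axis=0)  # E(X | Y=yi, Z=z)
    p_z_given_y = sl.sum(axis=0) / sl.sum()
    nested = (e_x_given_yz * p_z_given_y).sum()
    assert abs(direct - nested) < 1e-12

print("E(E(X | Z, Y) | Y) = E(X | Y) holds for every y")
```

Each loop iteration is a discrete instance of the displayed derivation: the inner `e_x_given_yz` plays the role of $E(X \mid Y=y, Z=z)$, and averaging it against $p(z \mid y)$ recovers $E(X \mid Y=y)$.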