In the book "Bayesian Data Analysis" by Gelman et al., they write probability densities that are simultaneously joint and conditional, such as $p(\tilde{y},\theta \vert y)$. How do I manipulate these quantities formally?
For example, in Eq. 1.4, they write
$$\int p(\tilde{y},\theta \vert y) \, d\theta = \int p(\tilde{y} \vert \theta, y) \, p(\theta \vert y) \, d\theta$$
Can someone please explain why the integrands on the two sides are equal?
In this example, $y$ is the data observed, generated from a process with parameter $\theta$. $\tilde{y}$ is the predicted data generated from the same process.
It's a trick! Think of it in terms of events. First recognise that, once you condition on $C$, intersecting with $C$ changes nothing:
$$P(A \cap B \cap C \mid C) = P(A \cap B \mid C).$$
Now apply the chain rule to the left-hand side, noting that $P(B \cap C \mid C) = P(B \mid C)$ for the same reason:
$$P(A \cap B \cap C \mid C) = P(A \mid B \cap C) \, P(B \cap C \mid C) = P(A \mid B \cap C) \, P(B \mid C).$$
Combining the two displays gives
$$P(A \cap B \mid C) = P(A \mid B \cap C) \, P(B \mid C).$$
With $A$, $B$, $C$ playing the roles of $\tilde{y}$, $\theta$, $y$, this is exactly the density identity $p(\tilde{y}, \theta \vert y) = p(\tilde{y} \vert \theta, y) \, p(\theta \vert y)$; integrating both sides over $\theta$ gives Eq. 1.4.
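You can also check the identity numerically on a toy discrete model. The sketch below assumes a hypothetical setup of my own (not from the book): a coin whose bias $\theta$ is either 0.3 or 0.7 with a uniform prior, $y$ is one observed flip, and $\tilde{y}$ is the next flip. The left side is computed by marginalizing the joint $p(\tilde{y}, \theta \vert y)$ over $\theta$, the right side by summing $p(\tilde{y} \vert \theta, y)\, p(\theta \vert y)$, and the two agree:

```python
# Toy discrete model (hypothetical example, not from the book):
# theta in {0.3, 0.7} with uniform prior; y and y_tilde are single coin flips.
thetas = [0.3, 0.7]
prior = {0.3: 0.5, 0.7: 0.5}

def likelihood(flip, theta):
    """p(flip | theta) for a single Bernoulli flip."""
    return theta if flip == 1 else 1 - theta

y = 1  # the observed flip

# Posterior p(theta | y) via Bayes' rule.
p_y = sum(likelihood(y, t) * prior[t] for t in thetas)
posterior = {t: likelihood(y, t) * prior[t] / p_y for t in thetas}

predictive = {}
for y_tilde in (0, 1):
    # Left side: build the full joint p(y, y_tilde, theta), condition on y,
    # then marginalise theta:  sum_theta p(y_tilde, theta | y).
    lhs = sum(prior[t] * likelihood(y, t) * likelihood(y_tilde, t)
              for t in thetas) / p_y

    # Right side: sum_theta p(y_tilde | theta, y) p(theta | y).
    # Given theta, y_tilde is conditionally independent of y,
    # so p(y_tilde | theta, y) = p(y_tilde | theta).
    rhs = sum(likelihood(y_tilde, t) * posterior[t] for t in thetas)

    assert abs(lhs - rhs) < 1e-12, "the two sides of Eq. 1.4 must match"
    predictive[y_tilde] = lhs

print(predictive)  # posterior predictive distribution p(y_tilde | y)
```

Note the extra modelling assumption made in the code comment: $p(\tilde{y} \vert \theta, y) = p(\tilde{y} \vert \theta)$ holds here because the flips are conditionally independent given $\theta$; the identity in Eq. 1.4 itself does not require it.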