Could anyone derive or explain why the formula $P(X,Y|Z)=P(Y|X,Z)P(X|Z)$ is true? I understand the definition of conditional probability, but this formula confuses me and makes my head hurt x)
Here's another similar formula that I have struggled to understand:
$$p(\mathbf{x}, \mathbf{\theta}|\mathcal{X})=p(\mathbf{x}|\mathbf{\theta},\mathcal{X})p(\mathbf{\theta}|\mathcal{X})$$
Could someone explain this one to me as well? This formula is from my neural networks book, but I have no idea why this is true, even though I understand the basic conditional probability formula $P(A|B) = \displaystyle\frac{P(A,B)}{P(B)}$. If I use this formula, what I would do for my book example is this:
$P(\mathbf{x},\mathbf{\theta}|\mathcal{X}) = \displaystyle\frac{P(\mathbf{x},\mathbf{\theta},\mathcal{X})}{P(\mathcal{X})}$
Could someone ease my frustration ;D
Thank you!
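From the definition of conditional probability (assuming $P(Z)>0$ and $P(X,Z)>0$, so that every quotient is defined),
$$ P(X,Y|Z) = \frac{P(X,Y,Z)}{P(Z)},$$
$$ P(Y|X,Z) = \frac{P(X,Y,Z)}{P(X,Z)},$$
$$ P(X|Z) = \frac{P(X,Z)}{P(Z)},$$
we have
$$P(Y|X,Z) P(X|Z)=\frac{P(X,Y,Z)}{P(X,Z)}\frac{P(X,Z)}{P(Z)}=\frac{P(X,Y,Z)}{P(Z)}=P(X,Y|Z).$$
This is also intuitive: given that $Z$ happens, the probability that both $X$ and $Y$ happen equals the probability that $X$ happens given $Z$, times the probability that $Y$ happens given that both $X$ and $Z$ happen.
Your book's formula is the same identity with $X=\mathbf{\theta}$, $Y=\mathbf{x}$ and $Z=\mathcal{X}$. Your own first step, $P(\mathbf{x},\mathbf{\theta}|\mathcal{X}) = \frac{P(\mathbf{x},\mathbf{\theta},\mathcal{X})}{P(\mathcal{X})}$, is correct; multiplying and dividing the right-hand side by $P(\mathbf{\theta},\mathcal{X})$ gives the two factors.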
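If it helps to see the identity hold numerically, here is a minimal sketch that checks it on a small made-up joint distribution. Everything in it is an assumption for illustration: three discrete variables taking values in {0, 1, 2}, a random strictly positive joint table, and the helper names `p_xz`, `p_z` and `max_chain_rule_gap`, none of which come from the book.

```python
import itertools
import random

random.seed(0)

# Hypothetical toy setup: X, Y, Z each take values in {0, 1, 2}.
values = range(3)
outcomes = list(itertools.product(values, repeat=3))

# Random strictly positive weights, normalised into a joint table
# joint[(x, y, z)] = P(X=x, Y=y, Z=z). Positivity keeps every
# conditional probability below well defined.
weights = [random.random() + 0.01 for _ in outcomes]
total = sum(weights)
joint = {xyz: w / total for xyz, w in zip(outcomes, weights)}


def p_xz(x, z):
    """Marginal P(X=x, Z=z): sum the joint over y."""
    return sum(joint[(x, y, z)] for y in values)


def p_z(z):
    """Marginal P(Z=z): sum the joint over x and y."""
    return sum(joint[(x, y, z)] for x in values for y in values)


def max_chain_rule_gap():
    """Largest |P(X,Y|Z) - P(Y|X,Z) P(X|Z)| over all outcomes."""
    gap = 0.0
    for x, y, z in outcomes:
        lhs = joint[(x, y, z)] / p_z(z)               # P(X,Y|Z)
        p_y_given_xz = joint[(x, y, z)] / p_xz(x, z)  # P(Y|X,Z)
        p_x_given_z = p_xz(x, z) / p_z(z)             # P(X|Z)
        gap = max(gap, abs(lhs - p_y_given_xz * p_x_given_z))
    return gap


print(max_chain_rule_gap())
```

The printed gap should be zero up to floating-point rounding, whatever seed is used, because the factor $P(X,Z)$ cancels exactly as in the derivation above.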