Conditional Probability where conditioning is on multiple variables


In Murphy's book Machine Learning: A Probabilistic Perspective, on page 77 he writes the following:

$p(\theta|D', D'') \propto p(D''|\theta)p(\theta|D')$

where $D'$ and $D''$ are data sets and $\theta$ is the parameter. I understand the chain rule of probability, but I am stumped as to how this result is formulated.

How is this result derived? If someone could explain, I would be very grateful.

Book link: https://www.cs.ubc.ca/~murphyk/MLbook/


Best answer

The chain rule of probability states that $$p(\theta , D' , D'') = p(D''|\theta, D')\,p(\theta \,|\,D')\,p(D').$$ In a Bayesian setup we usually assume that the data are conditionally independent given the parameter $\theta$, which means that $p(D',D''|\theta) = p(D'| \theta)\,p(D'' | \theta)$, and this in turn implies that $p(D'' | \theta , D') = p(D'' | \theta)$.

Now we have that \begin{align*} p(\theta | D',D'') &\propto p(\theta \: , D', D'') \\ &=p(D'' | \theta)p(\theta | D')p(D') \\ &\propto p(D''|\theta)p(\theta|D') \end{align*} where the last step drops $p(D')$ because it does not depend on $\theta$, so it is absorbed into the proportionality constant.
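To make the identity concrete, here is a minimal numerical sketch (my own illustration, not from Murphy's book) using a Beta-Bernoulli model, where the posterior update has a closed form: updating sequentially, with $p(\theta|D')$ serving as the prior for $D''$, gives the same posterior as a single batch update on the combined data. The function name `update` and the specific data sets are assumptions chosen for the example.

```python
# Beta-Bernoulli conjugate updating: prior Beta(a, b); observing
# Bernoulli data adds the number of successes to a and failures to b.
def update(a, b, data):
    heads = sum(data)
    tails = len(data) - heads
    return a + heads, b + tails

D1 = [1, 0, 1, 1]   # D'
D2 = [0, 1, 1]      # D''

# Sequential: compute p(theta | D'), then use it as the prior for D''.
a1, b1 = update(1.0, 1.0, D1)
a_seq, b_seq = update(a1, b1, D2)

# Batch: update the original prior on the combined data D' and D''.
a_batch, b_batch = update(1.0, 1.0, D1 + D2)

# The two posteriors coincide, as the derivation above predicts.
assert (a_seq, b_seq) == (a_batch, b_batch)
```

This is exactly the content of $p(\theta|D', D'') \propto p(D''|\theta)\,p(\theta|D')$: under the conditional-independence assumption, yesterday's posterior can be reused as today's prior.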