How did the posterior distribution get factorized in this manner? (bayes rule)


I am referring to this lecture on sampling: https://www.youtube.com/watch?v=TNZk8lo4e-Q. At around minute 6, the lecturer's slide shows the posterior probability factored as:

$$p(\Theta |X,Y)=\frac{p(Y|X,\Theta)p(\Theta)}{Z}$$ where $Z$ is the normalizing constant.

According to the product rule, the numerator should be

$$p(\Theta )p(X|\Theta )p(Y|X,\Theta )$$

What is the reason for dropping $p(X|\Theta )$? Thanks a lot for your help!


Best answer:

Machine learning derivations are often written loosely, but here is one way to make the steps explicit:

$$\begin{split}p(\Theta|X,Y)&=\frac{p(\Theta,Y|X)}{p(Y|X)}\\ &=\frac{p(Y|\Theta,X)p(\Theta|X)}{\int p(Y|\Theta,X)p(\Theta|X)d\Theta}\\ &=\frac{p(Y|\Theta,X)p(\Theta)}{\int p(Y|\Theta,X)p(\Theta)d\Theta}\end{split}$$

where the last equality holds because the prior distribution of $\Theta$ does not depend on the inputs $X$, so $p(\Theta|X)=p(\Theta)$; the $X$ are the givens (predictors). This also answers the original question: in the expansion $p(\Theta )p(X|\Theta )p(Y|X,\Theta )$, the factor $p(X|\Theta )=p(X)$ for the same reason, and since $p(X)$ does not depend on $\Theta$ it cancels with the identical factor in the normalizing constant $Z$.
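The cancellation can be checked numerically. Below is a minimal sketch with a hypothetical toy model (not from the lecture): a scalar parameter $\Theta$ on a grid, a Gaussian likelihood $p(Y|X,\Theta)$, and a Gaussian prior. Multiplying the numerator by any $\Theta$-independent constant (standing in for $p(X)$) leaves the normalized posterior unchanged.

```python
import numpy as np

# Hypothetical toy model: likelihood p(y | x, theta) = Normal(theta * x, 1),
# prior p(theta) = Normal(0, 1), theta discretized on a grid.
thetas = np.linspace(-3, 3, 601)          # parameter grid
prior = np.exp(-0.5 * thetas**2)          # unnormalized N(0, 1) prior p(theta)

x, y = 2.0, 1.5                           # one observed (x, y) pair
lik = np.exp(-0.5 * (y - thetas * x)**2)  # p(y | x, theta), up to a constant

# Posterior p(theta | x, y) = p(y | x, theta) p(theta) / Z on the grid
numer = lik * prior
post = numer / numer.sum()

# Including a theta-independent factor p(x) changes nothing: it cancels with Z.
p_x = 0.37                                # any constant, since p(x) does not depend on theta
numer2 = p_x * lik * prior
post2 = numer2 / numer2.sum()

print(np.allclose(post, post2))           # the two posteriors agree
print(thetas[np.argmax(post)])            # posterior mode, approximately x*y/(x**2 + 1) = 0.6
```

The same cancellation happens analytically: $p(X)$ appears once in the numerator and once inside the integral defining $Z$, so it divides out.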