I am referring to this course on sampling https://www.youtube.com/watch?v=TNZk8lo4e-Q. At around minute 6, the lecturer shows on the slide the posterior probability factored as:
$$p(\Theta |X,Y)=\frac{p(Y|X,\Theta)p(\Theta)}{Z}$$ where Z is the normalizing constant.
According to the product rule, the numerator should be
$$p(\Theta )p(X|\Theta )p(Y|X,\Theta )$$
What is the reason for dropping $p(X|\Theta )$? Thanks a lot for your help!
You can't always be fully rigorous in machine learning, but here is one way to think about it equationally:
$$\begin{split}p(\Theta|X,Y)&=\frac{p(\Theta,Y|X)}{p(Y|X)}\\ &=\frac{p(Y|\Theta,X)p(\Theta|X)}{\int p(Y|\Theta,X)p(\Theta|X)d\Theta}\\ &=\frac{p(Y|\Theta,X)p(\Theta)}{\int p(Y|\Theta,X)p(\Theta)d\Theta}\end{split}$$
where the last equality follows from the assumption that the prior distribution of $\Theta$ does not depend on the input values $X$, i.e. $p(\Theta|X)=p(\Theta)$; the $X$ are givens (predictors).
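A small numerical sketch can make this concrete. The example below is entirely hypothetical (a 1-D linear model $y = \theta x + \varepsilon$ with Gaussian noise, not from the lecture): it computes the posterior over $\theta$ on a grid as $p(Y|X,\theta)\,p(\theta)$ divided by its sum, and then checks that multiplying the numerator by any constant factor that does not depend on $\theta$ (such as $p(X)$, since $\theta$ does not model the inputs) leaves the normalized posterior unchanged, because it cancels into $Z$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: y = theta * x + noise, noise sd known to be 0.5.
# Theta is the only unknown; X is treated as given (the predictors).
X = rng.normal(size=20)
theta_true = 2.0
Y = theta_true * X + rng.normal(scale=0.5, size=20)

thetas = np.linspace(0.0, 4.0, 401)  # grid over theta

def log_lik(theta):
    # log p(Y | X, theta), up to an additive constant not involving theta
    resid = Y - theta * X
    return -0.5 * np.sum(resid**2) / 0.5**2

log_prior = -0.5 * thetas**2 / 10.0  # broad N(0, 10) prior on theta

log_num = np.array([log_lik(t) for t in thetas]) + log_prior

# Normalize: dividing by the sum over the grid plays the role of Z.
post = np.exp(log_num - log_num.max())
post /= post.sum()

# Multiply the numerator by an arbitrary theta-independent factor
# (standing in for p(X)); after normalization it cancels out.
post_scaled = np.exp(log_num - log_num.max()) * 7.3
post_scaled /= post_scaled.sum()

assert np.allclose(post, post_scaled)

theta_map = thetas[np.argmax(post)]
print(theta_map)  # should land near theta_true = 2.0
```

This is exactly why only the $\theta$-dependent factors of the numerator matter: anything else is absorbed into the normalizing constant.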