I'm having difficulty understanding/accepting this equation I saw in a course I'm doing:
When I try to derive it the usual way Bayes' rule is derived, by equating two factorizations of the same joint probability:
$$ P(\theta,X,y) = P(\theta)P(X|\theta)P(y|X,\theta) $$
$$ P(\theta,X,y) = P(y)P(X|y)P(\theta|X,y) $$
and going from there, I get
$$ P(\theta|X_{tr},y_{tr}) = \frac{P(y_{tr}|X_{tr},\theta)P(\theta|X_{tr})}{P(y_{tr}|X_{tr})}. $$
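In case the mistake is in my algebra, here are the intermediate steps as I understand them. They are written with generic $X, y$ standing in for $X_{tr}, y_{tr}$, and they assume only the product rule plus $P(X) > 0$ and $P(X,y) > 0$ so that the divisions are defined. First, equate the two factorizations and solve for the posterior:
$$ P(\theta|X,y) = \frac{P(\theta)P(X|\theta)P(y|X,\theta)}{P(y)P(X|y)} $$
Then rewrite the numerator and denominator pairs with the product rule:
$$ P(\theta)P(X|\theta) = P(X)P(\theta|X), \qquad P(y)P(X|y) = P(X)P(y|X) $$
Substituting and cancelling $P(X)$ gives
$$ P(\theta|X,y) = \frac{P(X)P(\theta|X)P(y|X,\theta)}{P(X)P(y|X)} = \frac{P(y|X,\theta)P(\theta|X)}{P(y|X)} $$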
So why does my straightforward derivation disagree with their equation?
