In Andrew Ng's lectures (CS229), the Bayesian logistic regression section contains the formula
$$P(Y|X,S)=\int_\theta P(Y|X,\theta)P(\theta|S)d\theta$$
Here, $\theta$ is treated as a random variable, and $S$ is the training set $\{(X^{(i)},Y^{(i)})\}_{i=1}^{m}$.
Using conditional probabilities, the equation makes intuitive sense, but I would really appreciate a rigorous proof.
Here is how far I got: $$ P(Y|X,S)=\int_\theta P(Y,\theta|X,S)d\theta $$ $$ = \int_\theta P(Y|\theta,X,S)P(\theta|X,S)d\theta$$
Does it assume any sort of independence?
Again, I get the intuition, but I can't seem to arrive at the final result. A written-out proof, or the equations required to prove it, would be appreciated.
The assumption applied is that of conditional independence. If $Z_1$ and $Z_2$ are independent, then $p(Z_1\lvert Z_2) = p(Z_1)$, hence $Z_2$ can be removed from the set of variables on which one is conditioning. Similarly, with conditional independence: if $Z_1$ and $Z_3$ are conditionally independent given $Z_2$, then $p(Z_1\lvert Z_2,Z_3) = p(Z_1\lvert Z_2)$, so once $Z_2$ is "controlled for", $Z_1$ does not depend on $Z_3$.
So the author is assuming that
$$p(Y\lvert \theta , X, S) = p(Y\lvert \theta ,X)$$
i.e. the dependent variable $Y$ depends on $S$ only through $X$ and $\theta$, and
$$p(\theta \lvert X,S) = p(\theta \lvert S)$$
i.e. the parameters $\theta$ do not depend on the new input $X$ once the training set $S$ is given.
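Substituting these two assumptions into the marginalization from the question completes the proof:

$$\begin{align} P(Y\lvert X,S) &= \int_\theta P(Y,\theta\lvert X,S)\,d\theta && \text{(marginalization over } \theta\text{)} \\ &= \int_\theta P(Y\lvert \theta,X,S)\,P(\theta\lvert X,S)\,d\theta && \text{(chain rule)} \\ &= \int_\theta P(Y\lvert X,\theta)\,P(\theta\lvert S)\,d\theta && \text{(the two conditional-independence assumptions)} \end{align}$$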
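As a side note, this integral is rarely tractable in closed form for logistic regression, which is why it is typically approximated by Monte Carlo. Here is a minimal sketch (not from the lectures) for a hypothetical one-dimensional $\theta$, where I fake the posterior $P(\theta\lvert S)$ with a Gaussian purely for illustration; in practice the samples would come from MCMC or a Laplace/variational approximation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for draws from the posterior p(theta | S).
# In a real application these would come from MCMC or an approximate posterior.
theta_samples = rng.normal(loc=1.0, scale=0.5, size=10_000)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Posterior predictive for a new input x:
#   P(Y=1 | x, S) = ∫ P(Y=1 | x, theta) p(theta | S) dtheta
#                 ≈ (1/N) * sum_i P(Y=1 | x, theta_i)
x = 2.0
p_y_given_x_s = sigmoid(theta_samples * x).mean()
```

The averaging over `theta_samples` is exactly the integral $\int_\theta P(Y\lvert X,\theta)P(\theta\lvert S)\,d\theta$ in Monte Carlo form: each sample contributes one plug-in prediction $P(Y\lvert X,\theta_i)$, weighted equally because the samples are drawn from $P(\theta\lvert S)$.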