Proof of this equality in "Bayesian prediction"

$p(y|x,S) = \int_{\theta}p(y|x,\theta)p(\theta|S)d\theta$

Is this formula correct in general, or only in Bayesian settings? How do I prove it?

Note: this formula comes from the Stanford CS229 course lecture notes: page 7, formula (2) of http://cs229.stanford.edu/notes/cs229-notes5.pdf

There is 1 solution below

In this set-up, $S$ is the observed data, $x$ is the input value you condition on, and the goal is the posterior predictive density of $y$.

There is some "deep parameter" $\theta$ about which you have a posterior based on the data, $p[\theta|S]$. You do not know $\theta$; you only have beliefs about what it might be. However, if you knew both $\theta$ and $x$, that would be sufficient to predict the density of $y$, given by $p[y|x,\theta]$.

So to overcome this lack of information, you integrate over all possible values of $\theta$, weighting the "true" density of $y$ given $x$ and $\theta$ by the posterior density you assign to that value of $\theta$.

So your prediction of $y$ conditional on the data $S$, for a given $x$ in which you are interested, is $$ \underbrace{p[y|x,S]}_{\text{Density of $y$ conditional on $x$, given data $S$}} = \int_\theta \underbrace{p[y|x,\theta]}_{\text{True relationship between $y$, $x$, $\theta$}} \quad \underbrace{p[\theta|S]}_{\text{Probability that $\theta$ is true given the data}} \, d\theta $$
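As for how to prove it, here is a sketch of one derivation. It relies on two conditional-independence assumptions that are common for this kind of model but are stated here as assumptions rather than taken from the notes: (1) once $\theta$ and $x$ are known, the data $S$ carry no further information about $y$, so $p[y|x,\theta,S] = p[y|x,\theta]$; (2) the new input $x$ on its own carries no information about $\theta$, so $p[\theta|x,S] = p[\theta|S]$.

$$ \begin{aligned} p[y|x,S] &= \int_\theta p[y,\theta|x,S] \, d\theta && \text{(marginalize over $\theta$)} \\ &= \int_\theta p[y|x,\theta,S] \, p[\theta|x,S] \, d\theta && \text{(product rule)} \\ &= \int_\theta p[y|x,\theta] \, p[\theta|S] \, d\theta && \text{(assumptions 1 and 2)} \end{aligned} $$

The first two lines are just the sum and product rules and hold for any joint density. Only the last step uses the model structure, and the whole expression only makes sense if $\theta$ is treated as a random variable with a distribution, which is what makes it a Bayesian statement.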

So you are integrating out your uncertainty over $\theta$ using your data-based beliefs.
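To make the "integrating out" concrete, here is a minimal numerical sketch. Everything in it is an illustrative assumption and not part of the original notes: a toy model $y = \theta x + \varepsilon$ with Gaussian noise of known variance `sigma2`, a Gaussian prior on $\theta$ with variance `tau2`, and simulated data. For this conjugate model the posterior $p[\theta|S]$ and the predictive density are both Gaussian in closed form, so the integral evaluated on a grid can be compared against the exact answer.

```python
import numpy as np

def normal_pdf(z, mean, var):
    """Density of N(mean, var) evaluated at z."""
    return np.exp(-0.5 * (z - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

# Hypothetical toy model (an assumption for illustration):
#   y = theta * x + noise,  noise ~ N(0, sigma2),  prior theta ~ N(0, tau2)
sigma2 = 0.25
tau2 = 4.0

# Simulated data set S = {(x_i, y_i)} with a made-up true theta of 1.5.
rng = np.random.default_rng(0)
xs = rng.uniform(-2.0, 2.0, size=20)
ys = 1.5 * xs + rng.normal(0.0, np.sqrt(sigma2), size=20)

# Conjugate Gaussian posterior p[theta | S] = N(post_mean, post_var).
post_var = 1.0 / (1.0 / tau2 + np.sum(xs ** 2) / sigma2)
post_mean = post_var * np.sum(xs * ys) / sigma2

def predictive_by_integration(y, x, n_grid=20001):
    """Approximate the integral of p[y|x,theta] * p[theta|S] over theta."""
    half_width = 10.0 * np.sqrt(post_var)  # covers essentially all posterior mass
    theta = np.linspace(post_mean - half_width, post_mean + half_width, n_grid)
    integrand = normal_pdf(y, theta * x, sigma2) * normal_pdf(theta, post_mean, post_var)
    dtheta = theta[1] - theta[0]
    # Trapezoidal rule written out by hand.
    return np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dtheta

def predictive_closed_form(y, x):
    """Exact predictive for this model: N(post_mean * x, sigma2 + x^2 * post_var)."""
    return normal_pdf(y, post_mean * x, sigma2 + x ** 2 * post_var)

x_new, y_new = 1.0, 1.2
numeric = predictive_by_integration(y_new, x_new)
closed = predictive_closed_form(y_new, x_new)
print("numerical integral:", numeric)
print("closed form:       ", closed)
```

The two printed values should agree to many decimal places. Note that the predictive variance `sigma2 + x**2 * post_var` is larger than the noise variance alone: the extra term is exactly the uncertainty about $\theta$ that the integral folds into the prediction.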