Bayesian Predictive Distribution Simulation


I am trying to understand why I can simulate from the predictive distribution of a regression:

$$\mathbb{P}(\overset{\sim}{Y}|Y)=\int f(\overset{\sim}{Y}|\sigma, \beta)\,\mathbb{P}(d\sigma,d\beta|Y)$$

by simulating from the posterior and then sampling from $f(\overset{\sim}{Y}|\sigma, \beta)$. The only way I can rationalize this is by thinking of it as a mixture model with the distribution: $$\mathbb{P}(\overset{\sim}{Y}|Y)\approx \sum f(\overset{\sim}{Y}|\sigma, \beta)\mathbb{P}(d\sigma,d\beta|Y)$$

where the sum is taken over a partition of the support, and sampling from the posterior predictive distribution is equivalent to sampling from $f(\overset{\sim}{Y}|\sigma, \beta)$ with probability $\mathbb{P}(d\sigma,d\beta|Y).$ Can someone elucidate why this works?

EDIT: $Y$ is the sample data and $\overset{\sim}{Y}$ is a new observation that is independent of the sample. The assumption is that $Y=X\beta +\epsilon$, where a prior is put on both $\beta$ and the variance parameter $\sigma$.


There is 1 best solution below


To calculate a general integral $\int g(x) f(x)\, dx$, where $f$ is a density that we can sample from, you sample $x_1, \dots, x_n \sim f$ and approximate $f(x)$ with $$\frac{1}{n} \sum_{i=1}^n \delta_{x_i}(x).$$ To be clear, this mixture of Dirac deltas means that we are approximating $X \sim f$ by the empirical distribution of the draws, $$\mathbb{P}(X = x) = \frac{1}{n} |\{i \in \{1, \dots, n\} : x_i = x\}|.$$ The integral is then approximated by $$\frac{1}{n} \sum_{i=1}^n g(x_i).$$
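As a minimal sketch of the approximation above, the snippet below estimates $\int g(x) f(x)\,dx$ by averaging $g$ over draws from $f$. The choices here are assumptions made only for illustration: $f$ is taken to be the standard normal density and $g(x) = x^2$, so the exact value is $1$; the helper name `mc_integral` is hypothetical.

```python
import random


def mc_integral(g, sampler, n, rng):
    """Approximate the integral of g(x) f(x) dx by averaging g over n draws from f."""
    return sum(g(sampler(rng)) for _ in range(n)) / n


rng = random.Random(0)  # fixed seed so the run is reproducible

# Illustrative assumption: f is the standard normal density and g(x) = x^2,
# so the exact integral is Var(X) = 1.
estimate = mc_integral(lambda x: x * x, lambda r: r.gauss(0.0, 1.0), 100_000, rng)
print(estimate)
```

The estimate should land close to $1$, with an error that shrinks at the usual $1/\sqrt{n}$ Monte Carlo rate.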

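In your case, the posterior distribution of $\sigma, \beta$ is approximated by a mixture of Dirac deltas at the posterior draws $(\sigma_1, \beta_1), \dots, (\sigma_n, \beta_n)$, and then the posterior predictive distribution of $\overset{\sim}{Y}$ given $Y$ is approximated by the mixture $$\frac{1}{n} \sum_{i=1}^n f(\overset{\sim}{Y} \mid \sigma_i, \beta_i).$$ Sampling from this mixture means picking an index $i$ uniformly at random and then drawing from $f(\overset{\sim}{Y} \mid \sigma_i, \beta_i)$, which is exactly the procedure of simulating from the posterior and then sampling from $f$.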
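The two-step procedure can be sketched as follows. This is only an illustration under loud assumptions: the "posterior draws" are made-up stand-ins (in practice they would come from whatever sampler you use for $\mathbb{P}(\sigma, \beta \mid Y)$), there is a single scalar covariate, and `x_new` is a hypothetical covariate value for the new observation.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is reproducible
n = 50_000
x_new = 3.0  # hypothetical covariate value for the new observation

# Stand-in posterior draws (sigma_i, beta_i). ASSUMPTION: these distributions are
# invented for illustration; real draws would come from your posterior sampler.
posterior_draws = [
    (abs(rng.gauss(1.0, 0.05)), rng.gauss(2.0, 0.1)) for _ in range(n)
]

# Composition step: for each posterior draw, draw one new observation from
# f(. | sigma_i, beta_i), here a normal with mean x_new * beta_i and sd sigma_i.
y_tilde = [rng.gauss(x_new * beta, sigma) for sigma, beta in posterior_draws]

pred_mean = statistics.fmean(y_tilde)
pred_var = statistics.variance(y_tilde)
print(pred_mean, pred_var)
```

Each triple $(\sigma_i, \beta_i, \overset{\sim}{Y}_i)$ is a draw from the joint distribution of parameters and new observation given $Y$, so discarding the parameters leaves $\overset{\sim}{Y}_i$ distributed according to the posterior predictive. For these stand-in draws, the law of total variance gives a predictive mean of $6$ and a variance of $\mathbb{E}[\sigma^2] + x_{\text{new}}^2 \operatorname{Var}(\beta) = 1.0025 + 0.09 = 1.0925$, which the simulated values should approximate.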

A couple of links you have probably already seen, although the second one doesn't mention Dirac deltas (I would say that you don't have to think of MC integration in terms of Dirac deltas but I find it the most useful way):

But Monte Carlo Statistical Methods is probably the best reference on the subject.