How do I find the posterior predictive distribution in Bayesian Analysis? I'm not looking for a specific case, I would like the general solution.
Predictive distribution in Bayesian Analysis
Asked by user117741 (https://math.techqa.club/user/user117741/detail). There are 2 solutions below.
Consider first that (this is very easy to prove)
$$\mathbb{P}[A,C|B]=\mathbb{P}[A|B,C]\cdot\mathbb{P}[C|B]$$
Now set
$A=\{y_{n+1}=\tilde{y}\}$
$B=\{Y_1=y_1,\dots,Y_n=y_n\}$, i.e. $\mathbf{Y}=\mathbf{y}$
$C=\theta$
and get
$$\mathbb{P}\{\tilde{y},\theta|\mathbf{y}\}=\mathbb{P}\{\tilde{y}|\theta,\mathbf{y}\}\cdot \mathbb{P}\{\theta|\mathbf{y}\}$$
Now, to eliminate $\theta$ from the LHS, integrate both sides with respect to $\theta$; using the conditional independence of the observations given $\theta$, you get:
$$\mathbb{P}\{\tilde{y}|\mathbf{y}\}=\int_{\Theta}\mathbb{P}\{\tilde{y}|\theta\}\cdot \mathbb{P}\{\theta|\mathbf{y}\}d\theta$$
where
$\mathbb{P}\{\tilde{y}|\theta\}$ is the statistical model (the likelihood of a single observation), and
$\mathbb{P}\{\theta|\mathbf{y}\}$ is the posterior.
Conditional independence of the observations given $\theta$ means that, in your case,
$$\mathbb{P}[\tilde{y}|\mathbf{y},\theta]=\mathbb{P}[\tilde{y}|\theta]$$
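The integral above can be approximated numerically when no closed form is available. A minimal sketch in Python; the Bernoulli likelihood and Uniform(0, 1) "posterior" below are assumptions chosen only so the answer is known exactly:

```python
import numpy as np

def posterior_predictive(likelihood, posterior, y_tilde, n_grid=10_000):
    """Approximate p(y_tilde | y) = ∫ p(y_tilde | θ) p(θ | y) dθ on (0, 1)
    with the midpoint rule on an evenly spaced grid of θ values."""
    h = 1.0 / n_grid
    theta = (np.arange(n_grid) + 0.5) * h          # midpoint of each cell
    return np.sum(likelihood(y_tilde, theta) * posterior(theta)) * h

# Toy check: Bernoulli likelihood with a Uniform(0, 1) "posterior"
# (i.e. no data observed yet) gives P(y_tilde = 1) = ∫ θ dθ = 1/2.
lik = lambda y, th: th**y * (1 - th)**(1 - y)
post = lambda th: np.ones_like(th)                 # Uniform(0, 1) density
p1 = posterior_predictive(lik, post, 1)
```

The same function works for any model on a bounded parameter space: only `likelihood` and `posterior` change.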
A simple example:
Suppose we have the following Bernoulli model
$$\mathbb{P}[x|\theta]=\theta^x(1-\theta)^{1-x}$$
with $x\in\{0,1\}$ and $\theta \in (0,1)$.
As we said above, the probability of a future success given the previous observations is
$$\mathbb{P}[\tilde{y}=1|\mathbf{y}]=\int_0^1 \theta p(\theta|\mathbf{y}) d\theta= \mathbb{E}[\theta|\mathbf{y}]$$
and similarly
$$\mathbb{P}[\tilde{y}=0|\mathbf{y}]=\int_0^1 (1-\theta) p(\theta|\mathbf{y}) d\theta= 1-\mathbb{E}[\theta|\mathbf{y}]$$
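The expectation $\mathbb{E}[\theta|\mathbf{y}]$ has a closed form under a conjugate Beta$(a,b)$ prior (an assumed choice, not part of the question): after $s$ successes in $n$ trials the posterior is Beta$(a+s,\,b+n-s)$, whose mean is $(a+s)/(a+b+n)$. A short sketch:

```python
# Posterior predictive for the Bernoulli model under an assumed
# Beta(a, b) prior: the posterior is Beta(a + s, b + n - s), so
# P(y_tilde = 1 | y) = E[θ | y] = (a + s) / (a + b + n).
def predictive_success(a, b, data):
    s, n = sum(data), len(data)
    return (a + s) / (a + b + n)

data = [1, 0, 1, 1, 0, 1]           # 4 successes in 6 trials
p = predictive_success(1, 1, data)  # uniform prior: (1 + 4) / (2 + 6) = 0.625
```

With $a=b=1$ (a uniform prior) this is Laplace's rule of succession.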

Say $\Theta$ is distributed as $p(\theta)\, d\theta,$ i.e. $$ \Pr(\Theta\in S) = \int\limits_S p(\theta)\,d\theta $$ and $$ X_1,\ldots,X_n \mid \Theta=\theta\sim\text{i.i.d. } f_\theta(x)\,dx $$ i.e. $$ \Pr(X_i\in A\mid \Theta=\theta) = \int\limits_A f_\theta(x)\,dx $$ and $X_1,\ldots,X_n$ are conditionally independent given $\Theta=\theta.$
Then the posterior distribution, i.e. the conditional distribution of $\Theta$ given $X_i=x_i$ for $i=1,\ldots,n,$ is $$ q(\theta)\, d\theta = \text{constant} \times p(\theta) f_\theta(x_1)\cdots f_\theta(x_n)\, d\theta $$ where the constant is so chosen that $\displaystyle \int q(\theta)\,d\theta$ (the integral being taken over the whole parameter space) is $1.$
Then the predictive distribution is given by $$ \Pr(X_{n+1}\in A \mid X_1=x_1,\ldots,X_n=x_n) = \int \left( \, \int\limits_A f_\theta(x)\,dx\right) q(\theta) \, d\theta, $$ the outer integral again being taken over the whole parameter space.
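The two steps above (normalize the posterior, then integrate the predictive probability against it) can be carried out numerically on a grid. A sketch for the Bernoulli model with a Uniform(0, 1) prior, both chosen here only for illustration:

```python
import numpy as np

# Grid of θ midpoints on (0, 1), so ∫ g(θ) dθ ≈ sum(g(θ)) * h.
n_grid = 20_000
h = 1.0 / n_grid
theta = (np.arange(n_grid) + 0.5) * h

x = [1, 0, 1, 1, 0, 1]                       # observed data

# Unnormalized posterior p(θ) f_θ(x_1)···f_θ(x_n), then normalize so
# that q integrates to 1, as in the recipe above.
prior = np.ones_like(theta)                  # p(θ): Uniform(0, 1) density
lik = np.ones_like(theta)
for xi in x:
    lik *= theta**xi * (1 - theta)**(1 - xi)
q = prior * lik / (np.sum(prior * lik) * h)  # normalized posterior q(θ)

# Predictive Pr(X_{n+1} = 1 | x) = ∫ θ q(θ) dθ; for this prior the
# conjugate answer is a Beta(5, 3) posterior with mean 5/8 = 0.625.
p_next = np.sum(theta * q) * h
```

Replacing the inner probability $\int_A f_\theta(x)\,dx$ by any other event probability works the same way: evaluate it on the grid and integrate against $q(\theta)$.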