A confusing exercise about Bayes' rule


The following is from a textbook on Bayesian statistics, and I can't follow one of its deductions. It concerns estimating multiple parameters.

The jth observation in the ith group is denoted by $y_{ij}$,
where

$$(y_{ij}|\mu_i,\sigma)\sim N(\mu_i,\sigma^2) \quad j=1,2, \dots, n_i \quad i= 1,2, \dots, m$$

Also, the $y_{ij}$ are independent of each other.

Suppose $\mu_i \sim N(\mu,\tau^2)$ and denote
$$\theta= (\mu, \log(\sigma),\log(\tau))$$ $$Y=\{y_{ij}: j=1,\dots, n_i, i=1,\dots, m\}$$ $$Z=(\mu_1,\dots, \mu_m)$$ $$n=n_1+n_2+\cdots +n_m$$
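To make the generative structure concrete, here is a small simulation sketch of the model above (my own illustration, not from the textbook; the values of $\mu$, $\tau$, $\sigma$ and the group sizes are arbitrary assumptions):

```python
# Sketch: simulate data from the two-level normal model
#   mu_i ~ N(mu, tau^2),  y_ij | mu_i ~ N(mu_i, sigma^2)
# Hyperparameter values and group sizes are assumed for illustration.
import numpy as np

rng = np.random.default_rng(0)

mu, tau, sigma = 5.0, 2.0, 1.0   # assumed values of the hyperparameters
n_per_group = [4, 6, 5]          # n_1, ..., n_m (so m = 3, n = 15 here)

# Draw each group mean mu_i ~ N(mu, tau^2) ...
mu_i = rng.normal(mu, tau, size=len(n_per_group))

# ... then each observation y_ij ~ N(mu_i, sigma^2), independent given mu_i.
Y = [rng.normal(mu_i[i], sigma, size=n) for i, n in enumerate(n_per_group)]

print([len(y) for y in Y])  # → [4, 6, 5]
```

Note that each $y_{ij}$ is drawn using only $\mu_i$ and $\sigma$; $\mu$ and $\tau$ enter only through the draw of $\mu_i$, which is exactly the conditional-independence structure the answer below exploits.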

So $\theta$ is the vector of unknown parameters we are interested in. Take its prior distribution to be $p(\theta) \propto \tau$. Then, by Bayes' rule, it is not difficult to obtain the posterior distribution:

$$p(Z,\theta|Y) \propto p(\theta) \prod\limits_{i = 1}^m {p(\mu_i|\mu,\tau)} \prod\limits_{i = 1}^m \prod\limits_{j = 1}^{n_i} {p(y_{ij}|\mu_i,\sigma)}$$

This is the step I can't understand. How is this formula obtained, given that formula No. 3 in this thread is not correct: I am confused about Bayes' rule in MCMC

Could someone explain it in detail? If there are any good books that would help, please list them.

Best answer:

AFAICT, the "trick" is that, by definition, $y_{ij}$ depends on $\mu$ and $\tau$ only via $\mu_i$. Thus, $$p(y_{ij}\mid\mu_i,\sigma) = p(y_{ij}\mid\mu_i,\sigma,\mu,\tau) = p(y_{ij}\mid\mu_i,\theta).$$

Similarly, $\mu_i$ does not depend on $\sigma$, so $$p(\mu_i\mid\mu,\tau) = p(\mu_i\mid\mu,\tau,\sigma) = p(\mu_i\mid\theta).$$ In particular, this means that we can rewrite your equation as

$$
\begin{aligned}
p(Z,\theta\mid Y) &\propto p(\theta) \prod_{i = 1}^m p(\mu_i\mid\mu,\tau) \prod_{i = 1}^m \prod_{j = 1}^{n_i} p(y_{ij}\mid\mu_i,\sigma) \\
&= p(\theta) \prod_{i = 1}^m p(\mu_i\mid\sigma,\mu,\tau) \prod_{i = 1}^m \prod_{j = 1}^{n_i} p(y_{ij}\mid\mu_i,\sigma,\mu,\tau) \\
&= p(\theta) \prod_{i = 1}^m p(\mu_i\mid\theta) \prod_{i = 1}^m \prod_{j = 1}^{n_i} p(y_{ij}\mid\mu_i,\theta) \\
&= p(\theta)\, p(Z\mid\theta)\, p(Y\mid Z,\theta) \\
&= p(Y,Z,\theta) \\
&= p(Z,\theta\mid Y)\, p(Y).
\end{aligned}
$$

Since $p(Y)$ does not depend on $(Z,\theta)$, the last line confirms that the right-hand side of your equation is indeed proportional to $p(Z,\theta\mid Y)$.
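In MCMC practice this factorization is used directly as the unnormalized log posterior. Here is a sketch (my own helper, not from the book; the data and parameter values at the bottom are made up for illustration):

```python
# Sketch: unnormalized log posterior log p(Z, theta | Y) + const,
# built term by term from the factorization
#   p(theta) * prod_i p(mu_i | mu, tau) * prod_{i,j} p(y_ij | mu_i, sigma).
import numpy as np

def log_norm_pdf(x, loc, scale):
    # Log density of N(loc, scale^2), evaluated elementwise.
    return -0.5 * np.log(2 * np.pi * scale**2) - (x - loc)**2 / (2 * scale**2)

def log_posterior(mu_i, mu, log_sigma, log_tau, Y):
    sigma, tau = np.exp(log_sigma), np.exp(log_tau)
    lp = np.log(tau)                                   # prior p(theta) ∝ tau
    lp += np.sum(log_norm_pdf(mu_i, mu, tau))          # prod_i p(mu_i | mu, tau)
    lp += sum(np.sum(log_norm_pdf(y, m, sigma))        # prod_{i,j} p(y_ij | mu_i, sigma)
              for y, m in zip(Y, mu_i))
    return lp

# Tiny made-up example with m = 2 groups:
Y = [np.array([1.0, 1.2]), np.array([2.9, 3.1, 3.0])]
mu_i = np.array([1.1, 3.0])
print(log_posterior(mu_i, mu=2.0, log_sigma=np.log(0.5), log_tau=np.log(1.0), Y=Y))
```

Each observation contributes through $p(y_{ij}\mid\mu_i,\sigma)$ only, never through $\mu$ or $\tau$ directly, which is the conditional independence the derivation above relies on.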