Derive the Normal-Normal conjugate relationship for the Normal mean $\mu$

I am to derive the Normal-Normal conjugate relationship for the Normal mean $\mu$ using the prior distribution $\mu \sim Normal(\mu_0, \tau_0)$.

I am assuming that the data take the form of $n$ i.i.d. draws $y_1, y_2, \ldots, y_n$, where each $y_i \sim Normal(\mu, \tau)$.

I just started my Bayesian class and am not sure of how to do this - would someone be able to explain how to do this?

Best answer

You have the following model (where $\tau$ denotes the precision, i.e. the inverse of the variance):

$$ \begin{aligned} y_1, \cdots, y_n | \mu, \tau &\sim Normal(\mu, \tau) \\[10pt] \mu &\sim Normal(\mu_0, \tau_0) \\ \end{aligned} $$

Since no prior distribution is given for $\tau$, the only posterior we can write down is the conditional one, $\mu | \tau, y_1, \cdots, y_n$, assuming that $\mu$ and $\tau$ are independent a priori.

So let's derive the posterior (writing $\pi(\cdot)$ for a distribution):

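$$ \begin{aligned} \pi(\mu | \tau, y_1, \cdots, y_n) &\propto \pi(y_1, \cdots, y_n | \mu, \tau) \times \pi(\mu)\\[10pt] &\propto \prod_{i=1}^n \frac{\tau^{1/2}}{\sqrt{2\pi}} e^{-\frac{\tau}{2} (y_i - \mu)^2} \times \frac{\tau_0^{1/2}}{\sqrt{2\pi}} e^{-\frac{\tau_0}{2} (\mu - \mu_0)^2} \\[10pt] &\propto \tau^{n/2} e^{-\frac{n\tau + \tau_0}{2} \left(\mu - \frac{n\tau\bar{y} + \tau_0 \mu_0}{n\tau + \tau_0}\right)^2} e^{- \frac{\tau \sum y_i^2}{2} + \frac{(n\tau\bar{y} + \tau_0 \mu_0)^2}{2(n\tau + \tau_0)}} \end{aligned} $$

where $\bar{y} = \frac{1}{n} \sum_{i=1}^n y_i$. For a fixed (known) $\tau$, only the first exponential involves $\mu$, and it is a Normal kernel, so $\mu | \tau, y_1, \cdots, y_n \sim Normal\left(\frac{n\tau\bar{y} + \tau_0 \mu_0}{n\tau + \tau_0}, n\tau + \tau_0\right)$ in the precision parametrization. The remaining factors do not involve $\mu$, but they do depend on $\tau$, so if $\tau$ is unknown this does not reduce to such a simple form.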
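To make the known-$\tau$ case concrete, here is a minimal Python sketch. It uses the standard result that, for fixed $\tau$, the posterior of $\mu$ is Normal with precision $n\tau + \tau_0$ and mean $(n\tau\bar{y} + \tau_0\mu_0)/(n\tau + \tau_0)$, and compares it with a brute-force normalization of likelihood times prior on a grid. The function names, the data and the hyperparameter values are made up for illustration.

```python
import math

def normal_known_precision_posterior(y, tau, mu0, tau0):
    """Closed-form posterior (mean, precision) of mu when tau is known."""
    n = len(y)
    ybar = sum(y) / n
    tau_n = n * tau + tau0                        # posterior precision
    mu_n = (n * tau * ybar + tau0 * mu0) / tau_n  # precision-weighted mean
    return mu_n, tau_n

def grid_posterior_mean_var(y, tau, mu0, tau0, lo=-20.0, hi=20.0, steps=4001):
    """Numerical check: normalise likelihood x prior over a grid of mu values."""
    h = (hi - lo) / (steps - 1)
    grid = [lo + i * h for i in range(steps)]
    logp = [
        -0.5 * tau * sum((yi - m) ** 2 for yi in y) - 0.5 * tau0 * (m - mu0) ** 2
        for m in grid
    ]
    top = max(logp)                               # subtract max to avoid underflow
    w = [math.exp(lp - top) for lp in logp]
    z = sum(w)
    mean = sum(m * wi for m, wi in zip(grid, w)) / z
    var = sum((m - mean) ** 2 * wi for m, wi in zip(grid, w)) / z
    return mean, var

# Illustrative (assumed) data and hyperparameters.
y = [1.2, 0.7, 2.1, 1.5, 0.9]
mu_n, tau_n = normal_known_precision_posterior(y, tau=2.0, mu0=0.0, tau0=0.5)
grid_mean, grid_var = grid_posterior_mean_var(y, tau=2.0, mu0=0.0, tau0=0.5)

print("closed form:", mu_n, 1.0 / tau_n)
print("grid check :", grid_mean, grid_var)
```

The two lines printed should agree to several decimal places, which is a quick way to catch an algebra slip when completing the square.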


If $\tau$ is unknown and has no prior, the model is not fully specified. In that case one can place a Gamma prior (shape, rate) $\tau \sim \Gamma(\nu_0/2, \sigma_0^2 \nu_0/2)$ on the precision while keeping the independent Normal prior on $\mu$, which is not jointly conjugate. More conveniently, one can condition $\mu$ on $\tau$, with hyperparameters $\nu_0, \sigma_0^2$ for $\tau$ and a prior effective sample size $\kappa_0$ for $\mu$. This is the well-known conjugate model for a Normal with both mean and variance unknown.

$$ \begin{aligned} y_1, \cdots, y_n | \mu, \tau &\sim Normal(\mu, \tau) \\[10pt] \mu | \tau &\sim Normal(\mu_0, \kappa_0 \tau) \\[10pt] \tau &\sim \Gamma(\nu_0/2, \sigma_0^2 \nu_0/2) \end{aligned} $$

Marginalizing over $\tau$ gives the prior $\mu \sim t_{\nu_0} (\mu_0, \sigma_0/\sqrt{\kappa_0})$, a scaled $t$ distribution with $\nu_0$ degrees of freedom, location $\mu_0$ and scale $\sigma_0/\sqrt{\kappa_0}$.
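The marginal $t$ prior can be checked by simulation. The sketch below draws $\tau$ from the Gamma prior (shape $\nu_0/2$, rate $\nu_0\sigma_0^2/2$), then draws $\mu$ given $\tau$ with precision $\kappa_0\tau$, and compares the sample variance with the variance of a scaled $t$, namely $\frac{\sigma_0^2}{\kappa_0} \cdot \frac{\nu_0}{\nu_0 - 2}$ for $\nu_0 > 2$. The hyperparameter values, the seed and the function name are arbitrary choices for illustration.

```python
import math
import random

def sample_marginal_prior_mu(mu0, kappa0, nu0, sigma0_sq, size, seed=0):
    """Draw mu from its marginal prior by first drawing tau, then mu | tau."""
    rng = random.Random(seed)
    shape = nu0 / 2.0
    rate = nu0 * sigma0_sq / 2.0
    draws = []
    for _ in range(size):
        # random.gammavariate takes (shape, scale), so pass scale = 1 / rate.
        tau = rng.gammavariate(shape, 1.0 / rate)
        # mu | tau has precision kappa0 * tau, i.e. sd = 1 / sqrt(kappa0 * tau).
        draws.append(rng.gauss(mu0, 1.0 / math.sqrt(kappa0 * tau)))
    return draws

# Illustrative (assumed) hyperparameters; nu0 > 2 so the variance exists.
mu0, kappa0, nu0, sigma0_sq = 1.0, 2.0, 10.0, 1.5
draws = sample_marginal_prior_mu(mu0, kappa0, nu0, sigma0_sq, size=100_000)

sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (len(draws) - 1)
t_var = (sigma0_sq / kappa0) * nu0 / (nu0 - 2.0)  # variance of the scaled t

print("sample mean:", sample_mean, "target:", mu0)
print("sample var :", sample_var, "target:", t_var)
```

Matching the first two moments does not prove the distribution is a $t$, but it is a cheap sanity check on the location and scale.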

In this model, the joint posterior is

$$ \pi(\mu, \tau | y_1, \cdots, y_n) \propto \tau^{\frac{\nu_0 + n -1}{2}} e^{-\frac{\tau}{2} \left( (\kappa_0 + n) (\mu - \frac{\kappa_0 \mu_0 + n \bar{y}}{\kappa_0 + n})^2 + \frac{\kappa_0 n}{\kappa_0 + n} (\bar{y} - \mu_0)^2 + \sum (y_i - \bar{y})^2 + \nu_0 \sigma_0^2 \right)} $$

This looks intimidating, but defining $\kappa_n = \kappa_0 + n, \nu_n = \nu_0 + n, \mu_n = \frac{\kappa_0\mu_0 + n \bar{y}}{\kappa_0 + n}, \sigma_n^2 = \frac{1}{\nu_n} \left( \frac{\kappa_0n}{\kappa_n} (\bar{y} - \mu_0)^2 + \sum (y_i - \bar{y})^2 + \nu_0 \sigma_0^2 \right)$ and marginalizing out $\tau$, we have

$$ \begin{aligned} \pi(\mu | y_1, \cdots, y_n) &= \int \pi(\mu, \tau | y_1, \cdots, y_n) d \tau \\ &\propto \int \tau^{\frac{\nu_n -1}{2}} e^{-\frac{\tau}{2} \left[ \kappa_n (\mu - \mu_n)^2 + \sigma_n^2 \nu_n \right]} d \tau \\ &\propto \frac{1}{\left[ 1 + \frac{(\mu - \mu_n)^2}{(\sigma_n^2/\kappa_n) \nu_n} \right]^{\frac{\nu_n+1}{2}}} \end{aligned} $$

Thus $\mu | y_1, \cdots, y_n \sim t_{\nu_n} (\mu_n, \sigma_n/\sqrt{\kappa_n})$. For $\tau | y_1, \cdots, y_n$, I'll leave the derivation to you; the result is $\Gamma(\nu_n/2, \nu_n\sigma_n^2/2)$.
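As a rough sketch of how the updated hyperparameters could be computed in practice, the Python snippet below implements $\kappa_n, \nu_n, \mu_n, \sigma_n^2$ with $\sigma_n^2 = \frac{1}{\nu_n}\left(\nu_0\sigma_0^2 + \sum (y_i - \bar{y})^2 + \frac{\kappa_0 n}{\kappa_n}(\bar{y} - \mu_0)^2\right)$. It also checks the completed-square identity behind the joint posterior at an arbitrary value of $\mu$. The function name, the data and the prior values are assumptions made for the example.

```python
import math

def normal_gamma_update(y, mu0, kappa0, nu0, sigma0_sq):
    """Posterior hyperparameters (mu_n, kappa_n, nu_n, sigma_n^2)."""
    n = len(y)
    ybar = sum(y) / n
    ss = sum((yi - ybar) ** 2 for yi in y)        # sum of squares about ybar
    kappa_n = kappa0 + n
    nu_n = nu0 + n
    mu_n = (kappa0 * mu0 + n * ybar) / kappa_n
    sigma_n_sq = (
        nu0 * sigma0_sq + ss + (kappa0 * n / kappa_n) * (ybar - mu0) ** 2
    ) / nu_n
    return mu_n, kappa_n, nu_n, sigma_n_sq

# Illustrative (assumed) data and prior hyperparameters.
y = [1.2, 0.7, 2.1, 1.5, 0.9]
mu0, kappa0, nu0, sigma0_sq = 0.0, 1.0, 2.0, 1.0
mu_n, kappa_n, nu_n, sigma_n_sq = normal_gamma_update(y, mu0, kappa0, nu0, sigma0_sq)

# Identity used in the derivation, checked at an arbitrary value of mu:
# kappa0 (mu - mu0)^2 + sum (y_i - mu)^2 + nu0 sigma0^2
#   = kappa_n (mu - mu_n)^2 + nu_n sigma_n^2
mu = 0.3
lhs = kappa0 * (mu - mu0) ** 2 + sum((yi - mu) ** 2 for yi in y) + nu0 * sigma0_sq
rhs = kappa_n * (mu - mu_n) ** 2 + nu_n * sigma_n_sq

# Location and scale of the marginal t posterior of mu (nu_n degrees of freedom).
t_scale = math.sqrt(sigma_n_sq / kappa_n)

print("mu_n, kappa_n, nu_n, sigma_n^2:", mu_n, kappa_n, nu_n, sigma_n_sq)
print("identity check:", lhs, rhs)
print("t posterior: df =", nu_n, "loc =", mu_n, "scale =", t_scale)
```

With these hyperparameters, $\tau | y_1, \cdots, y_n$ has shape $\nu_n/2$ and rate $\nu_n\sigma_n^2/2$, matching the Gamma result stated above.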