Dimensionality and functional form of the natural conjugate prior to the two-parameter Normal distribution


Exponential family distributions take the following form: $$ p(x | \boldsymbol{\eta}) = h(x) \exp\left(\boldsymbol{\eta} \cdot \mathbf{T}(x) - A(\boldsymbol{\eta}) \right) $$ where $\boldsymbol{\eta}$ are the natural parameters, $\mathbf{T}(x)$ are the sufficient statistics, and $A(\boldsymbol{\eta})$ is the log-normalizer (log-partition function).

The Normal distribution with mean $\mu$ and precision $\tau$ can be rewritten in exponential family form as

$$ p(x | \mu, \tau) = \sqrt{\frac{\tau}{2\pi}} e^{-\frac{\tau}{2}(x - \mu)^2} = \frac{1}{\sqrt{2\pi}} e^{-\frac{\tau}{2}x^2 + \tau x \mu - \frac{\tau}{2}\mu^2 + \frac{1}{2} \ln \tau} $$ where $\eta_1 = \tau \mu$, $\eta_2 = -\frac{\tau}{2}$, $T_1(x) = x$, $T_2(x) = x^2$, $h(x) = \frac{1}{\sqrt{2\pi}}$, and the log-normalizer, written in terms of $(\mu, \tau)$, is $A(\mu, \tau) = \frac{\tau \mu^2}{2} - \frac{\ln \tau}{2}$.
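A quick numerical sanity check of this rewriting, as a minimal sketch: the function names and the test values are my own assumptions, not part of the derivation. It only confirms that the two ways of writing the density agree at a few points.

```python
import math

def normal_pdf(x, mu, tau):
    """Normal density with mean mu and precision tau."""
    return math.sqrt(tau / (2 * math.pi)) * math.exp(-0.5 * tau * (x - mu) ** 2)

def expfam_pdf(x, mu, tau):
    """The same density written as h(x) * exp(eta . T(x) - A)."""
    eta = (tau * mu, -0.5 * tau)                    # natural parameters
    T = (x, x * x)                                  # sufficient statistics
    A = 0.5 * tau * mu ** 2 - 0.5 * math.log(tau)   # log-normalizer
    h = 1.0 / math.sqrt(2 * math.pi)
    return h * math.exp(eta[0] * T[0] + eta[1] * T[1] - A)

# Arbitrary test points (assumed values, chosen only for illustration)
for x, mu, tau in [(0.3, 1.0, 2.0), (-1.5, 0.2, 0.7)]:
    assert math.isclose(normal_pdf(x, mu, tau), expfam_pdf(x, mu, tau))
```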

Now, the natural conjugate prior for a distribution in the exponential family takes the following form

$$ p(\boldsymbol{\eta} | \boldsymbol{\lambda}, \nu) = \exp \left(\boldsymbol{\lambda} \boldsymbol{\eta} - \nu A(\boldsymbol{\eta}) - f(\boldsymbol{\lambda}, \nu) \right) \propto \exp(\lambda_1 \tau \mu - \lambda_2 \frac{\tau}{2} - \nu(\frac{\tau \mu^2}{2} - \frac{\ln \tau}{2})) $$

Question 1: Does this imply that the conjugate prior necessarily has dim$(\boldsymbol{\eta}) + 1$ (three) parameters?

We know that the conjugate prior to the Normal distribution parametrized via the precision is the Normal-Gamma distribution, which has four (!) parameters:

\begin{align} p(\mu, \tau | \mu_0, \kappa_0, \alpha_0, \beta_0) & \propto \tau^{\alpha_0 - \frac{1}{2}} \exp(-\beta_0 \tau) \exp\left(-\frac{\tau \kappa_0(\mu - \mu_0)^2}{2}\right) \\ & = \exp\left((\alpha_0 - \frac{1}{2}) \ln \tau - \beta_0 \tau - \frac{\tau \kappa_0 \mu^2}{2} + \tau \kappa_0 \mu \mu_0 - \frac{\tau \kappa_0 \mu^2_0}{2}\right) \end{align}

Matching coefficients with the three-parameter form above implies

$$ \lambda_1 = \kappa_0 \mu_0 \hspace{5mm} \lambda_2 = \kappa_0 \mu^2_0 + 2 \beta_0 \hspace{5mm} \nu = \kappa_0 \hspace{5mm} \text{and} \hspace{5mm} \nu = 2 \alpha_0 - 1 $$

Question 2: Is this derivation correct? Why do I get $\kappa_0 = 2 \alpha_0 - 1$?

Best answer

If you use the same $\boldsymbol{\eta}$ for both the likelihood $p(x \mid \boldsymbol{\eta})$ and the prior $p(\boldsymbol{\eta} \mid \boldsymbol{\lambda}, \nu)$, and $A(\boldsymbol{\eta}) \neq 0$ so that $A(\boldsymbol{\eta})$ can be multiplied by a constant $\nu$, then I can't see a problem with your argument. Your derivation of the parameters is correct: with a single $\nu$ scaling both the $\tau \mu^2$ and the $\ln \tau$ terms, the three-parameter form forces $\kappa_0 = 2 \alpha_0 - 1$.
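To make that concrete, here is a minimal numerical sketch of the matching from the question. The function names and all numeric values are assumptions chosen for illustration. It compares the log of the Normal-Gamma kernel with the log of the three-parameter kernel: the gap vanishes when $\alpha_0 = (\kappa_0 + 1)/2$, and otherwise it is $(\alpha_0 - \frac{1}{2} - \frac{\kappa_0}{2}) \ln \tau$, which depends on $\tau$.

```python
import math

def log_kernel_normal_gamma(mu, tau, mu0, kappa0, alpha0, beta0):
    """Log of the unnormalized Normal-Gamma density in (mu, tau)."""
    return ((alpha0 - 0.5) * math.log(tau) - beta0 * tau
            - 0.5 * tau * kappa0 * (mu - mu0) ** 2)

def log_kernel_three_param(mu, tau, lam1, lam2, nu):
    """Log of exp(lam1*eta1 + lam2*eta2 - nu*A), with eta = (tau*mu, -tau/2)."""
    A = 0.5 * tau * mu ** 2 - 0.5 * math.log(tau)
    return lam1 * tau * mu - lam2 * tau / 2 - nu * A

# Hyperparameters (arbitrary assumed values)
mu0, kappa0, beta0 = 0.5, 3.0, 1.2
lam1 = kappa0 * mu0
lam2 = kappa0 * mu0 ** 2 + 2 * beta0
nu = kappa0

def gap(alpha0, mu, tau):
    """Difference between the two log kernels at the point (mu, tau)."""
    return (log_kernel_normal_gamma(mu, tau, mu0, kappa0, alpha0, beta0)
            - log_kernel_three_param(mu, tau, lam1, lam2, nu))

alpha_tied = (kappa0 + 1) / 2   # the value forced by nu = 2*alpha0 - 1
alpha_free = 5.0                # any other value of alpha0

for mu, tau in [(0.1, 0.5), (-2.0, 2.0), (1.3, 4.0)]:
    # With the tied alpha0 the two kernels coincide everywhere.
    assert math.isclose(gap(alpha_tied, mu, tau), 0.0, abs_tol=1e-9)
    # With a free alpha0 the gap is (alpha0 - 1/2 - kappa0/2) * ln(tau).
    expected = (alpha_free - 0.5 - kappa0 / 2) * math.log(tau)
    assert math.isclose(gap(alpha_free, mu, tau), expected, abs_tol=1e-9)
```

Because the leftover term varies with $\tau$, it cannot be absorbed into the normalizing constant, so a Normal-Gamma with free $\alpha_0$ is not reachable from the three-parameter form.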

For the normal-gamma setting, the number of observations in the likelihood, $n$, belongs in $\boldsymbol{T}(x)$. [Observing $\sum x_i = 0, \sum x_i^2 = 1$ describes completely different data sets depending on whether $n=2$ or $n=100$.]
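The bracketed remark can be illustrated with a small sketch (the helper name is hypothetical): the same two sums give very different maximum-likelihood estimates of the variance once $n$ changes.

```python
def mle_from_summaries(n, sum_x, sum_x2):
    """Maximum-likelihood mean and variance from (n, sum x_i, sum x_i^2)."""
    mean = sum_x / n
    var = sum_x2 / n - mean ** 2
    return mean, var

print(mle_from_summaries(2, 0.0, 1.0))    # (0.0, 0.5)
print(mle_from_summaries(100, 0.0, 1.0))  # (0.0, 0.01)
```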

The likelihood of observations $x_1, \dots, x_n$ is

$$p(x_1, \dots, x_n \mid \mu, \tau) \propto \exp\left\{\frac{n}{2}\ln \tau - \frac{\tau}{2} \sum x_i^2 + \tau \mu \sum x_i - \frac{n}{2} \tau \mu^2 \right\}.$$

If we do not treat $n$ as a fixed constant, the likelihood can be written as

$$p(x_1, \dots, x_n \mid \mu, \tau) = h(x) \exp\left\{ \boldsymbol{\eta} \cdot \boldsymbol{T}(x) - A(\boldsymbol{\eta}) \right\},$$

where

$$h(x) = (2\pi)^{-n/2}, \quad \boldsymbol{\eta} = \left(\ln \tau, \tau, \tau \mu, \tau \mu^2\right)^T, \quad \boldsymbol{T}(x) = \left( \frac{n}{2}, -\frac{1}{2}\sum x_i^2, \sum x_i, -\frac{n}{2} \right)^T, \quad A(\boldsymbol{\eta}) = 0.$$
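A minimal sketch to check this four-dimensional decomposition numerically. The data and parameter values are arbitrary assumptions, and the function names are my own. It compares the log-likelihood computed term by term with $\ln h(x) + \boldsymbol{\eta} \cdot \boldsymbol{T}(x)$.

```python
import math

def log_lik_direct(xs, mu, tau):
    """Sum of Normal log-densities with mean mu and precision tau."""
    return sum(0.5 * math.log(tau / (2 * math.pi)) - 0.5 * tau * (x - mu) ** 2
               for x in xs)

def log_lik_four_dim(xs, mu, tau):
    """ln h(x) + eta . T(x), with A(eta) = 0 and n carried inside T(x)."""
    n = len(xs)
    eta = (math.log(tau), tau, tau * mu, tau * mu ** 2)
    T = (n / 2, -0.5 * sum(x * x for x in xs), sum(xs), -n / 2)
    log_h = -0.5 * n * math.log(2 * math.pi)
    return log_h + sum(e * t for e, t in zip(eta, T))

xs = [0.4, -1.1, 2.3, 0.9]   # assumed data, for illustration only
assert math.isclose(log_lik_direct(xs, 0.6, 1.7), log_lik_four_dim(xs, 0.6, 1.7))
```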

Since $A(\boldsymbol{\eta}) = 0$, we have no need for $\nu$ in the general form of the prior

$$p(\boldsymbol{\eta} \mid \boldsymbol{\lambda}, \nu) \propto \exp\left\{ \boldsymbol{\lambda} \boldsymbol{\eta} - \nu A(\boldsymbol{\eta}) \right\}$$

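so the conjugate prior has $\textrm{dim}(\boldsymbol{\eta}) = 4$ parameters, matching the four parameters $(\mu_0, \kappa_0, \alpha_0, \beta_0)$ of the Normal-Gamma distribution.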
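As a sketch of how the four-parameter form behaves under updating: matching the expanded Normal-Gamma exponent from the question against $\boldsymbol{\lambda} \cdot \boldsymbol{\eta}$ with $\boldsymbol{\eta} = (\ln \tau, \tau, \tau\mu, \tau\mu^2)$ suggests $\boldsymbol{\lambda} = (\alpha_0 - \frac{1}{2},\ -\beta_0 - \frac{\kappa_0 \mu_0^2}{2},\ \kappa_0 \mu_0,\ -\frac{\kappa_0}{2})$, and the posterior is then $\boldsymbol{\lambda} + \boldsymbol{T}(x)$. The function names below are hypothetical and the numbers are assumed values, so treat this as an illustration rather than a reference implementation.

```python
def to_lambda(mu0, kappa0, alpha0, beta0):
    """Normal-Gamma hyperparameters -> coefficients of (ln tau, tau, tau*mu, tau*mu^2)."""
    return (alpha0 - 0.5,
            -beta0 - 0.5 * kappa0 * mu0 ** 2,
            kappa0 * mu0,
            -0.5 * kappa0)

def from_lambda(lam):
    """Inverse of to_lambda: recover (mu0, kappa0, alpha0, beta0)."""
    l1, l2, l3, l4 = lam
    kappa = -2.0 * l4
    mu = l3 / kappa
    alpha = l1 + 0.5
    beta = -l2 - 0.5 * kappa * mu ** 2
    return mu, kappa, alpha, beta

def posterior_lambda(lam, xs):
    """Conjugate update: add the sufficient statistics T(x) to lambda."""
    n = len(xs)
    T = (n / 2, -0.5 * sum(x * x for x in xs), sum(xs), -n / 2)
    return tuple(l + t for l, t in zip(lam, T))

# Assumed prior hyperparameters and data, for illustration only
prior = to_lambda(mu0=0.0, kappa0=1.0, alpha0=2.0, beta0=1.5)
xs = [0.4, -1.1, 2.3, 0.9]
mu_n, kappa_n, alpha_n, beta_n = from_lambda(posterior_lambda(prior, xs))
print(mu_n, kappa_n, alpha_n, beta_n)
```

In this parametrization the update is plain vector addition, and all four hyperparameters move independently, with $\kappa_0$ and $\alpha_0$ no longer tied together.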