Computing mean and variance of a compound normal distribution with $\mu$ and $\sigma^2$ being random

Given the compound distribution $Y\sim\mathcal N(\mu,\sigma^2)$ with $\mu\sim F_\mu$ and $\sigma^2\sim F_{\sigma^2}$, how do we evaluate $\mathsf E(Y)$ and $\mathsf{Var}(Y)$ in terms of $\mathsf E(\mu)$, $\mathsf E(\sigma^2)$, $\mathsf{Var}(\mu)$, and $\mathsf{Var}(\sigma^2)$?

If only one parameter were random, this would be a direct application of the laws of total expectation and total variance. For example, suppose $\mu$ is fixed and only $\sigma^2$ is random. Then $$ \mathsf E(Y)=\mathsf E(\mathsf E(Y\mid\sigma^2))=\mathsf E(\mu)=\mu $$ and $$ \mathsf{Var}(Y)=\mathsf E(\mathsf{Var}(Y\mid\sigma^2))+\mathsf{Var}(\mathsf E(Y\mid\sigma^2))=\mathsf E(\sigma^2)+\mathsf{Var}(\mu)=\mathsf E(\sigma^2), $$ where the last step uses $\mathsf{Var}(\mu)=0$ for fixed $\mu$. I am not clear on how to generalize these results to the case where $\mu$ and $\sigma^2$ are both random and possibly dependent. Can someone please explain?
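For concreteness, here is a minimal Monte Carlo sketch suggesting $\mathsf E(Y)=\mathsf E(\mu)$ and $\mathsf{Var}(Y)=\mathsf E(\sigma^2)+\mathsf{Var}(\mu)$; the particular choices $\mu\sim\mathcal N(2,3)$ and $\sigma^2\sim\mathrm{Exp}(1)$ (independent) are assumptions made purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Illustrative (assumed) choices: mu ~ N(2, 3), sigma^2 ~ Exp(mean 1), independent.
mu = rng.normal(2.0, np.sqrt(3.0), n)
sigma2 = rng.exponential(1.0, n)

# One draw of Y | mu, sigma^2 ~ N(mu, sigma^2) per sampled parameter pair.
y = rng.normal(mu, np.sqrt(sigma2))

print(y.mean())  # ~ E(mu) = 2
print(y.var())   # ~ E(sigma^2) + Var(mu) = 1 + 3 = 4
```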

2 Answers

Accepted Answer

I think another name for "compound distribution" is "hierarchical model." The hierarchical model is:

$$\begin{split}Y|\mu, \sigma^2 &\sim N(\mu, \sigma^2)\\ \mu &\sim F_\mu\\ \sigma^2 &\sim F_{\sigma^2}\end{split}$$

If we assume that $\mu$ and $\sigma^2$ are independent, we can use the basic rules of probability to get the marginal distribution of $Y$:

$$f(y)=\int_0^\infty\int_{-\infty}^\infty f(y|\mu,\sigma^2)\,f(\mu)\,f(\sigma^2)\,d\mu\,d\sigma^2$$

or, if they are not independent, assume WLOG that we know $f(\mu|\sigma^2)$ and $f(\sigma^2)$; then

$$\begin{split}Y|\mu, \sigma^2 &\sim N(\mu, \sigma^2)\\ \mu|\sigma^2 &\sim F_{\mu|\sigma^2}\\ \sigma^2 &\sim F_{\sigma^2}\end{split}$$

$$f(y)=\int_0^\infty\int_{-\infty}^\infty f(y|\mu,\sigma^2)\,f(\mu|\sigma^2)\,f(\sigma^2)\,d\mu\,d\sigma^2$$
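If no closed form is available, the marginal density can be evaluated by numerical integration. Here is a hedged sketch, assuming (purely for illustration) the independent priors $\mu\sim\mathcal N(0,1)$ and $\sigma^2\sim\mathrm{Exp}(1)$; for the dependent case one would replace $f(\mu)f(\sigma^2)$ by $f(\mu|\sigma^2)f(\sigma^2)$ in the integrand:

```python
import numpy as np
from scipy import integrate, stats

# Assumed illustrative priors: mu ~ N(0, 1), sigma^2 ~ Exp(mean 1), independent.
def integrand(s2, m, y):
    return (stats.norm.pdf(y, loc=m, scale=np.sqrt(s2))
            * stats.norm.pdf(m) * stats.expon.pdf(s2))

def marginal_pdf(y):
    # Outer integral over mu in R, inner over sigma^2 in (0, inf).
    val, _ = integrate.dblquad(integrand, -np.inf, np.inf, 0, np.inf, args=(y,))
    return val

print(marginal_pdf(0.0))  # f(0) under the assumed priors

# Crude Monte Carlo cross-check of the same density value.
rng = np.random.default_rng(1)
m = rng.normal(size=200_000)
y = rng.normal(m, np.sqrt(rng.exponential(size=200_000)))
print(np.mean(np.abs(y) < 0.05) / 0.10)  # ~ f(0)
```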

From $f(y)$ one can then compute the mean and variance as usual. Alternatively, we can use the rules of conditional probability to obtain them directly:

$$\begin{split}E(Y)&= \int y f(y) dy\\ &= \int y \int \int f(y|\mu, \sigma^2) f(\mu|\sigma^2) f(\sigma^2) d\mu d\sigma^2 dy\\ &= \int \int \int y f(y|\mu, \sigma^2) dy f(\mu|\sigma^2) f(\sigma^2) d\mu d\sigma^2\\ &= \int \int E(Y|\mu, \sigma^2) f(\mu|\sigma^2) f(\sigma^2) d\mu d\sigma^2\\ &= \int E(E(Y|\mu, \sigma^2)|\sigma^2) f(\sigma^2) d\sigma^2\\ &=E(E(E(Y|\mu, \sigma^2)|\sigma^2))\\ &=E(E(\mu|\sigma^2))\end{split}$$

In fact, by the tower property, $E(E(\mu|\sigma^2))=E(\mu)$, the mean of $F_\mu$, whether or not $\mu$ and $\sigma^2$ are independent. In terms of the variance, the Wikipedia page for the law of total variance gives the iterated decomposition:

$$\begin{split}Var(Y)&=E(Var(Y|\mu, \sigma^2)) + E(Var(E(Y|\mu, \sigma^2)|\mu)) + Var(E(Y|\mu))\\ &= E(\sigma^2) + E(Var(\mu|\mu))+Var(\mu)\\ &=E(\sigma^2)+Var(\mu),\end{split} $$

where the second line uses $E(Y|\mu,\sigma^2)=\mu$ and $E(Y|\mu)=E(E(Y|\mu,\sigma^2)|\mu)=\mu$, and the last line uses $Var(\mu|\mu)=0$. Again, no independence between $\mu$ and $\sigma^2$ is required.
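The absence of an independence assumption can be checked by simulation. A hedged sketch with a deliberately dependent construction, assumed purely for illustration: $\sigma^2\sim\mathrm{Exp}(1)$ and $\mu|\sigma^2\sim N(\sigma^2,1)$, so that $Var(\mu)=E(Var(\mu|\sigma^2))+Var(E(\mu|\sigma^2))=1+1=2$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Deliberately dependent, purely illustrative construction:
# sigma^2 ~ Exp(mean 1), then mu | sigma^2 ~ N(sigma^2, 1).
sigma2 = rng.exponential(1.0, n)
mu = rng.normal(sigma2, 1.0)
y = rng.normal(mu, np.sqrt(sigma2))

print(y.mean(), mu.mean())                # both ~ E(mu) = E(sigma^2) = 1
print(y.var(), sigma2.mean() + mu.var())  # both ~ E(sigma^2) + Var(mu) = 1 + 2 = 3
```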

Second Answer

You can use the law of total expectation conditioning on a vector too; nothing restricts it to a single random variable. The statement $E[Y|X]$ is in fact $E[Y|\sigma(X)]$, where $\sigma(X)$ is the sigma-algebra generated by $X$, and $\sigma(\mu,\sigma^2)$ is just as legitimate a sigma-algebra to condition on. This measure-theoretic formulation is more abstract than the elementary usage, so I won't dwell on it here.

Let us denote by $F$ the unconditional distribution of $Y$, by $N$ the conditional distribution of $Y$ given $(\mu,\sigma)$, and by $G$ the joint distribution of $(\mu, \sigma)$. Then $$ E_F[Y] = E_G[E_N[Y|\mu,\sigma]] = E_G[\mu], $$ which one can show equals $E_{F_\mu}[\mu]$, the expectation of $\mu$ under its marginal distribution. You can write this out explicitly through the representation $E[Y|\mu = m,\sigma = s] = m$, integrating against the joint density $g$ of $(\mu, \sigma)$ as follows: $$ \begin{aligned} E_F[Y] &= \int_{m} \int_s E[Y|\mu = m, \sigma = s]\, g(m,s)\,ds\,dm\\ &=\int_{m}\int_s m\, g(m,s)\,ds\, dm\\ &= \int_m m \int_s g(m,s)\, ds\, dm\\ &= \int_m m\, f_\mu (m)\, dm \\ &= E_{F_\mu}[\mu]. \end{aligned}$$
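The iterated integral can also be evaluated numerically for a concrete joint density. A hedged sketch, assuming (purely for illustration) the dependent joint $\sigma\sim\mathrm{Exp}(1)$ and $\mu|\sigma=s\sim N(s/2,1)$, for which $E_{F_\mu}[\mu]=E[\sigma]/2=1/2$:

```python
import numpy as np
from scipy import integrate, stats

# Assumed illustrative joint: sigma ~ Exp(mean 1), mu | sigma = s ~ N(s/2, 1).
def g(m, s):
    return stats.norm.pdf(m, loc=0.5 * s, scale=1.0) * stats.expon.pdf(s)

# int_m m int_s g(m, s) ds dm, as in the derivation above:
# outer integral over m in R, inner over s in (0, inf).
val, _ = integrate.dblquad(lambda s, m: m * g(m, s), -np.inf, np.inf, 0, np.inf)
print(val)  # ~ E_{F_mu}[mu] = 0.5
```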

The same approach works for the variance (omitting the subscripts for the respective distributions): $$ \begin{aligned}Var(Y) &= E[Y^2]-E[Y]^2 \\&= E[E[Y^2|\mu,\sigma]]-E[E[Y|\mu,\sigma]]^2\\ &= E[\sigma^2+\mu^2]-E[\mu]^2 \\&= E[\sigma^2]+E[\mu^2]-E[\mu]^2 \\ &=E[\sigma^2]+Var(\mu). \end{aligned} $$
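A quick simulation of this second-moment route, reusing the illustrative independent choices from the question ($\mu\sim\mathcal N(2,3)$, $\sigma^2\sim\mathrm{Exp}(1)$, both assumptions for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Same assumed illustrative choices as in the question's sketch.
mu = rng.normal(2.0, np.sqrt(3.0), n)
sigma2 = rng.exponential(1.0, n)
y = rng.normal(mu, np.sqrt(sigma2))

lhs = (y**2).mean() - y.mean()**2                    # E[Y^2] - E[Y]^2
rhs = sigma2.mean() + (mu**2).mean() - mu.mean()**2  # E[sigma^2] + E[mu^2] - E[mu]^2
print(lhs, rhs)  # both ~ E(sigma^2) + Var(mu) = 4
```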