Using Law of Total Variance to find the Variance of Mixture Distribution


Consider a mixture of measures $Q$ given by $$ Q = \sum_{k=1}^{N} p_k D_k $$ where each $D_k$ is a probability measure with known (finite) mean $\mu_k$ and variance $\sigma_k^2$, the $D_k$ are independent of one another, and $$ \sum_{k=1}^{N} p_k = 1. $$ Using the standard approach of taking expectations, it can easily be shown that the variance of this mixture of probability measures is given by $$ \text{Var}(Q) = \sum_{k=1}^{N} p_k (\sigma_k^2 + \mu_k^2) - \left( \sum_{k=1}^{N} p_k \mu_k \right)^2. $$

I am attempting to obtain this result using the Law of Total Variance, which states that for random variables $X$ and $Y$ with finite variance $$ \text{Var}(Y) = \mathbb{E}\left[ \text{Var}\left( Y \mid X \right) \right] + \text{Var}\left( \mathbb{E}\left[ Y \mid X \right] \right), $$ but so far I have been failing miserably.

My first thought was to write $$ \text{Var}(Q) = \mathbb{E}\left[ \text{Var}\left( \left. \sum_{k=1}^{N} p_k D_k \right\vert D_j \right) \right] + \text{Var}\left( \mathbb{E}\left[ \left. \sum_{k=1}^{N} p_k D_k \right\vert D_j \right] \right) $$ and then determine that $$ \text{Var}\left( \left. \sum_{k=1}^{N} p_k D_k \right\vert D_j \right) = \text{Var}\left( \sum_{k \neq j}^{N} p_k D_k \right) $$ and $$ \mathbb{E}\left[ \left. \sum_{k=1}^{N} p_k D_k \right\vert D_j \right] = p_j D_j + \sum_{k \neq j}^{N} p_k \mu_k $$ to finally get $$ \text{Var}(Q) = \mathbb{E}\left[ \text{Var}\left( \sum_{k \neq j}^{N} p_k D_k \right) \right] + p_j^2 \sigma_j^2. $$

I could then repeat this process to find $$ \text{Var}\left( \sum_{k \neq j}^{N} p_k D_k \right) $$ and have repeated expectations, but this would eventually result in $$ \text{Var}(Q) = \sum_{k=1}^{N} p_k^2 \sigma_k^2, $$ which is certainly incorrect. I think the issue with this attempt is that I am simply treating these probability measures $D_k$ as random variables in order to apply the form of the Law of Total Variance that I know, but this must be wrong.
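As a quick numerical sanity check (the two-component normal mixture and all parameter values below are made up purely for illustration), a short Monte Carlo simulation confirms the correct closed-form mixture variance and shows that the flawed derivation's $\sum_k p_k^2 \sigma_k^2$ does not match:

```python
import random

# Hypothetical two-component normal mixture, chosen only for illustration.
p = [0.3, 0.7]        # mixture weights p_k
mu = [0.0, 5.0]       # component means mu_k
sigma = [1.0, 2.0]    # component standard deviations sigma_k

# Closed-form mixture variance: sum p_k (sigma_k^2 + mu_k^2) - (sum p_k mu_k)^2
mean_mix = sum(pk * mk for pk, mk in zip(p, mu))
var_mix = sum(pk * (sk**2 + mk**2) for pk, sk, mk in zip(p, sigma, mu)) - mean_mix**2

# The flawed derivation's result: sum p_k^2 sigma_k^2
var_wrong = sum(pk**2 * sk**2 for pk, sk in zip(p, sigma))

# Monte Carlo: draw a component index k with probability p_k, then sample D_k.
random.seed(0)
n = 200_000
samples = []
for _ in range(n):
    k = random.choices(range(len(p)), weights=p)[0]
    samples.append(random.gauss(mu[k], sigma[k]))
m = sum(samples) / n
var_mc = sum((x - m) ** 2 for x in samples) / n
```

Here `var_mix` is 8.35 and `var_wrong` is only 2.05; the Monte Carlo estimate `var_mc` agrees with the former, not the latter.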
How may I apply the Law of Total Variance to a mixture of probability measures, or perhaps expand the idea of the Law of Total Variance?

BEST ANSWER

Let $Z$ be a random variable with $\mathbb{P}(Z = k) = p_k$, and let $X$ be a random variable such that, conditional on $Z = k$, $X \sim D_k$; then $X \sim Q$. In particular, $\mathbb{E}[X \vert Z = k] = \mu_k$ and $\text{Var}(X \vert Z = k) = \sigma_k^2$, which we write as $\mu(Z)$ and $\sigma^2(Z)$ respectively. Writing $\overline{\mu} = \sum_{k=1}^{N} p_k \mu_k$ and applying the total variance identity, we have \begin{align*} \text{Var}(X) &= \mathbb{E}\left[ \text{Var}(X \vert Z) \right] + \text{Var}\left( \mathbb{E}[X \vert Z] \right) \\ &= \mathbb{E}\left[ \sigma^2(Z) \right] + \text{Var}\left( \mu(Z) \right) \\ &= \sum_{k=1}^{N} p_k \sigma_k^2 + \sum_{k=1}^{N} p_k (\mu_k - \overline{\mu})^2 \\ &= \sum_{k=1}^{N} p_k \left( \sigma_k^2 + \mu_k^2 \right) - \left( \sum_{k=1}^{N} p_k \mu_k \right)^2 \end{align*} which is precisely the previously derived result. Note that no distributional assumption on the $D_k$ is needed beyond finite means and variances.
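The decomposition above can be checked numerically: the within-component term $\mathbb{E}[\text{Var}(X \mid Z)]$ plus the between-component term $\text{Var}(\mathbb{E}[X \mid Z])$ equals the closed-form variance from the question. A minimal sketch, with a hypothetical three-component example (the weights, means, and variances below are arbitrary):

```python
# Hypothetical mixture parameters, chosen only for illustration.
p = [0.2, 0.5, 0.3]     # mixture weights p_k
mu = [1.0, -2.0, 4.0]   # component means mu_k
var = [0.5, 1.5, 2.0]   # component variances sigma_k^2

mu_bar = sum(pk * mk for pk, mk in zip(p, mu))

# E[Var(X|Z)]: average within-component variance, sum p_k sigma_k^2
within = sum(pk * vk for pk, vk in zip(p, var))

# Var(E[X|Z]): variance of the component means, sum p_k (mu_k - mu_bar)^2
between = sum(pk * (mk - mu_bar) ** 2 for pk, mk in zip(p, mu))

# Closed-form expression from the question
closed = sum(pk * (vk + mk**2) for pk, vk, mk in zip(p, var, mu)) - mu_bar**2

assert abs((within + between) - closed) < 1e-12
```

With these numbers, `within + between` and `closed` both come out to 8.29, matching term-by-term the algebra in the answer.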