Expectation and Covariance of Random Variable with Mixture Distribution


A random vector $\mathbf{X}$ has an $m$-variate normal mixture distribution when, with probability $q$, it is drawn from the $N_m(\mathbf{\mu}_1, \mathbf{\Sigma}_1)$ distribution with density $f_1$ and, with probability $1-q$, from the $N_m(\mathbf{\mu}_2, \mathbf{\Sigma}_2)$ distribution with density $f_2$. It has probability density function $f(\mathbf{X})=qf_1(\mathbf{X})+(1-q)f_2(\mathbf{X})$.

(a) How many parameters are in this distribution?

(b) Find $E(\mathbf{X})$ and $\text{Covar}(\mathbf{X})$

$\mathbf{Proof}$

$\mathbf{\mu}_1,\mathbf{\mu}_2$ trivially have $m$ components each. For $\mathbf{\Sigma}_1, \mathbf{\Sigma}_2$ we need only count the upper triangle, since each is symmetric. The diagonal has $m$ entries, the adjacent superdiagonal has $m-1$, and continuing this counting process, each covariance matrix has $m+(m-1)+(m-2)+\dots+1=\frac{m^2-m}{2}+m$ free entries. So all together we have:

$$m+m+\Bigl(\frac{m^2-m}{2}+m\Bigr)+\Bigl(\frac{m^2-m}{2}+m\Bigr)=2m+m^2-m+2m=3m+m^2$$

parameters in this distribution. (If the mixing probability $q$ is also counted as a parameter, there is one more, for $m^2+3m+1$ in total.)
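As a quick sanity check on the counting argument, the sketch below tallies the free parameters directly (two mean vectors plus two symmetric covariance matrices) and compares against the closed form $m^2+3m$:

```python
def mixture_param_count(m):
    """Free parameters of a two-component m-variate normal mixture
    (two mean vectors plus two symmetric covariance matrices)."""
    means = 2 * m                  # mu_1 and mu_2
    covs = 2 * (m * (m + 1) // 2)  # upper triangles of Sigma_1, Sigma_2
    return means + covs

# Agrees with the closed form m^2 + 3m derived above:
print(all(mixture_param_count(m) == m**2 + 3*m for m in range(1, 10)))  # True
```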

$$E(\mathbf{X})=E(qf_1(\mathbf{X})+(1-q)f_2(\mathbf{X}))=E(qf_1(\mathbf{X}))+E((1-q)f_2(\mathbf{X}))=qE(f_1(\mathbf{X}))+(1-q)E(f_2(\mathbf{X}))=q\mathbf{\mu_1}+(1-q)\mathbf{\mu}_2$$

I feel fairly confident about the parameter-count part of this problem, but I am unsure whether I have communicated the expectation operations correctly. How do you interpret $E(f_1(\mathbf{X}))$, if not the way I already have it? I could also use help understanding how to calculate the covariance. Is it correct to say that $\text{Covar}(f_1(\mathbf{X}))=\mathbf{\Sigma}_1$?
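The claimed expectation $q\mathbf{\mu}_1+(1-q)\mathbf{\mu}_2$ can be checked by simulation. Below is a minimal sketch using NumPy, with made-up values for $q$, $\mu_i$, $\Sigma_i$: a component label is drawn first, then the sample is taken from the chosen normal.

```python
import numpy as np

rng = np.random.default_rng(0)
q, n = 0.3, 200_000
mu1, mu2 = np.array([1.0, -1.0]), np.array([4.0, 2.0])
S1 = np.array([[1.0, 0.2], [0.2, 1.0]])
S2 = np.array([[2.0, -0.5], [-0.5, 1.5]])

# Two-stage sampling: pick component 1 with probability q, else component 2.
z = rng.random(n) < q
x = np.where(z[:, None],
             rng.multivariate_normal(mu1, S1, size=n),
             rng.multivariate_normal(mu2, S2, size=n))

print(np.round(x.mean(axis=0), 2))  # close to q*mu1 + (1-q)*mu2
print(q * mu1 + (1 - q) * mu2)      # [3.1 1.1]
```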


$\mathbf{Best\ Answer}$

Writing things like "$E[f_1(X)]$" is a bit problematic; you aren't really plugging the random variable into the density function. But your computation leads to the right answer, even if the notation is nonstandard.


Let $Z$ be the random variable satisfying $P(Z=1)=q$ and $P(Z=2)=1-q$. Then the conditional distribution of $X$ given $Z=1$ is $N_m(\mu_1, \Sigma_1)$, and the conditional distribution of $X$ given $Z=2$ is $N_m(\mu_2, \Sigma_2)$.

Then, by the law of total expectation, $E[X] = E[E[X \mid Z]] = E[X \mid Z=1] P(Z=1) + E[X \mid Z=2] P(Z=2) = q\mu_1 + (1-q) \mu_2$, which is what you got.

The law of total covariance applies to each entry of the matrix $\text{Cov}(X)$, and ultimately becomes $$\text{Cov}(X) = E[\text{Cov}(X \mid Z)] + \text{Cov}(E[X \mid Z]).$$ The first term is $q\Sigma_1 + (1-q) \Sigma_2$ by a computation similar to the expectation computation above. The other term is $$\text{Cov}(E[X \mid Z]) = E[E[X \mid Z]E[X \mid Z]^\top] - E[E[X \mid Z]]E[E[X \mid Z]]^\top = q \mu_1 \mu_1^\top + (1-q) \mu_2 \mu_2^\top - (q\mu_1 + (1-q) \mu_2)(q\mu_1 + (1-q) \mu_2)^\top$$ where I am just using the formula $\text{Cov}(Y) = E[YY^\top] - E[Y]E[Y]^\top$ with $Y=E[X \mid Z]$. Expanding the product and collecting terms, the second term simplifies to $q(1-q)(\mu_1-\mu_2)(\mu_1-\mu_2)^\top$, so in total $$\text{Cov}(X) = q\Sigma_1 + (1-q)\Sigma_2 + q(1-q)(\mu_1-\mu_2)(\mu_1-\mu_2)^\top.$$
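The covariance formula can likewise be verified numerically. The sketch below uses made-up parameter values, samples the mixture by first drawing the component label $Z$, and compares the empirical covariance against the two terms of the law of total covariance:

```python
import numpy as np

rng = np.random.default_rng(1)
q, n = 0.3, 400_000
mu1, mu2 = np.array([1.0, -1.0]), np.array([4.0, 2.0])
S1 = np.array([[1.0, 0.2], [0.2, 1.0]])
S2 = np.array([[2.0, -0.5], [-0.5, 1.5]])

# Two-stage sampling: draw Z, then X | Z.
z = rng.random(n) < q
x = np.where(z[:, None],
             rng.multivariate_normal(mu1, S1, size=n),
             rng.multivariate_normal(mu2, S2, size=n))

# Law of total covariance: Cov(X) = E[Cov(X|Z)] + Cov(E[X|Z])
within = q * S1 + (1 - q) * S2
mbar = q * mu1 + (1 - q) * mu2
between = (q * np.outer(mu1, mu1) + (1 - q) * np.outer(mu2, mu2)
           - np.outer(mbar, mbar))
theory = within + between

print(np.allclose(np.cov(x, rowvar=False), theory, atol=0.05))
```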