Variance of a Population of Two Independent Random Variables

I have a question regarding a problem I'm looking at out of personal curiosity. Here is the basic setup of the problem:

A population is made up of half type A and half type B. The known distribution of outcomes for the whole population is N(50, 7.5), and the known distribution of outcomes for type A is N(50, 6.5), where the second parameter is the standard deviation. Assuming type B is also normally distributed with the same mean, what must the variance of B be in order to get the population outcome?

I originally framed the problem as finding the variance of the average of the random variables:

$$Z = \frac{1}{2}(A + B)$$

Which gave me:

$$Var[Z] = \frac{1}{4}(Var[A] + Var[B])$$

Solving for the unknown $Var[B]$ gives an answer that does not check out in Monte Carlo simulation. This suggests that sampling from a population made up of two types cannot be represented as the average of the two types' random variables.
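For concreteness, here is that calculation worked through. It is a sketch that assumes the second parameter in N(50, 7.5) and N(50, 6.5) is a standard deviation and that A and B are independent:

$$Var[B] = 4\,Var[Z] - Var[A] = 4(7.5^2) - 6.5^2 = 225 - 42.25 = 182.75$$

That corresponds to a standard deviation of about 13.5 for B.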

From trial and error, I found that the standard deviation of B must be about 8.38 (a variance of about 70.25). This is consistent with the following:

$$Var[Z] = \frac{1}{2}Var[A] + \frac{1}{2}Var[B]$$

When I experiment with other ratios in the population, the population variance is still the sum of the type A and type B variances, each weighted by that type's proportion of the population.
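The kind of simulation described above might look like the following minimal sketch. The function name `simulate`, its parameters, the sample size, and the seed are all hypothetical choices for illustration, not the code I originally ran. It draws A and B independently, then compares the average of the two draws against a draw taken from A with probability `p` and from B otherwise.

```python
import random
import statistics


def simulate(sd_a, sd_b, mean=50.0, p=0.5, n=200_000, seed=0):
    """Return (sd of the average of A and B, sd of the p-mixture of A and B).

    Assumes A and B are independent normals sharing the same mean.
    """
    rng = random.Random(seed)
    averaged = []
    mixed = []
    for _ in range(n):
        a = rng.gauss(mean, sd_a)
        b = rng.gauss(mean, sd_b)
        # Framing 1: every observation is the average of one A and one B.
        averaged.append(0.5 * (a + b))
        # Framing 2: every observation is a single individual,
        # of type A with probability p and of type B otherwise.
        mixed.append(a if rng.random() < p else b)
    return statistics.pstdev(averaged), statistics.pstdev(mixed)


if __name__ == "__main__":
    sd_avg, sd_mix = simulate(sd_a=6.5, sd_b=8.38)
    print(f"average of A and B: sd = {sd_avg:.2f}")
    print(f"50/50 mixture:      sd = {sd_mix:.2f}")
```

With these inputs the mixture's standard deviation should land near 7.5, while the average's should be noticeably smaller, near 5.3. Changing `p` lets you check other population ratios.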

What leads to this general rule? I have a hunch I've just forgotten a basic rule of populations and random variables. Thank you in advance for any help.

I think that I need to represent the problem as:

$$Z \sim \begin{cases} A & \text{with probability } p \\ B & \text{with probability } 1-p \end{cases}$$
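Under that representation, the weighted-sum rule appears to follow from the law of total variance. The sketch below introduces a type indicator $T$ and the symbols $\mu_A$, $\mu_B$ for the two means; these are my own notation, not part of the original setup:

$$Var[Z] = E\big[Var[Z \mid T]\big] + Var\big[E[Z \mid T]\big] = p\,Var[A] + (1-p)\,Var[B] + p(1-p)(\mu_A - \mu_B)^2$$

When the two types share the same mean, the last term vanishes, leaving $Var[Z] = p\,Var[A] + (1-p)\,Var[B]$. For the numbers above, reading 7.5 and 6.5 as standard deviations and taking $p = \frac{1}{2}$:

$$Var[B] = 2(7.5^2) - 6.5^2 = 112.5 - 42.25 = 70.25$$

which is a standard deviation of about 8.38.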