Mathematical Stats: Find the expectation and variance of a population if 40% are X and 60% are Y.


I am familiar with how to use the linearity of the expectation and variance of random variables; however, I have a problem that I don't understand how to represent with random variables in the first place. The problem states that 40% of a given country is urban and 60% is rural. The urbanites have a mean income of 5 with a variance of 2, and the ruralites have a mean income of 4 with a variance of 3. What are the mean and variance of income for the whole population?

It's tempting to thoughtlessly model this as $.4X + .6Y$, but that doesn't really make sense since I couldn't tell you what $X$ is here. However, it does make sense that the answer for the mean should be the weighted average $.4\cdot 5+.6\cdot 4$, which is what you'd get if you did it that way.

I'm now hypothesizing that maybe I shouldn't be modeling this by two random variables at all. Perhaps I should just use the definitions of the mean and variance, and argue that $\mu = P(X=\text{urban})(5)+P(X=\text{rural})(4)$, and likewise for the variance? But 5 isn't really the value associated with $X=\text{urban}$; it's just the mean income of that group.

Any clarification would be appreciated.

[Edit: it now seems to me this is basically looking for the "pooled variance", for which there is a formula.]


There are 2 best solutions below

0
On BEST ANSWER

Consider three random variables: the income $X$ of a typical urban resident, the income $Y$ of a typical rural resident, and a Bernoulli random variable $Z$ that is independent of $X$ and $Y$ and distributed as $\mathbb P(Z=1)=0.4$, $\mathbb P(Z=0)=0.6$.

Then $ZX+(1-Z)Y$ represents the income of a randomly chosen person. This is the same as choosing $X$ with probability $0.4$ and $Y$ with probability $0.6$.

The mean of this quantity is
$$ \mathbb E[ZX+(1-Z)Y] = \mathbb E[Z]\,\mathbb E[X]+\mathbb E[1-Z]\,\mathbb E[Y]=0.4\cdot 5+0.6\cdot 4=4.4, $$
where the first equality uses the independence of $Z$ from $X$ and $Y$. To find the variance, first obtain the second moment. Note that $Z(1-Z)=0$, $Z^2=Z$, and $(1-Z)^2=1-Z$, so the cross term vanishes. Then
$$ \mathbb E[(ZX+(1-Z)Y)^2]=\mathbb E[ZX^2+(1-Z)Y^2]=0.4\,\mathbb E[X^2]+0.6\,\mathbb E[Y^2]. $$
Here
$$ \mathbb E[X^2] = \text{Var}(X)+\left(\mathbb E[X]\right)^2 = 2+25=27, $$
$$ \mathbb E[Y^2] = \text{Var}(Y)+\left(\mathbb E[Y]\right)^2 = 3+16=19. $$
Finally, the variance is the second moment minus the square of the mean: $0.4\cdot 27+0.6\cdot 19-4.4^2=22.2-19.36=2.84$.
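As a quick numerical check of the moment calculation in this answer, here is a minimal Python sketch. The helper name `mixture_mean_var` is made up for illustration and is not from any library; it only needs each group's weight, mean, and variance, and uses the problem's stated mean of 4 for the rural group.

```python
def mixture_mean_var(weights, means, variances):
    """Mean and variance of a mixture, given per-group weights, means, variances.

    Hypothetical helper for illustration; it follows the second-moment
    argument above: E[X_i^2] = Var(X_i) + (E[X_i])^2 for each group.
    """
    if abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must sum to 1")
    # Mixture mean: weighted average of the group means.
    mean = sum(w * m for w, m in zip(weights, means))
    # Mixture second moment: weighted average of the group second moments.
    second_moment = sum(
        w * (v + m * m) for w, m, v in zip(weights, means, variances)
    )
    # Variance = second moment minus squared mean.
    return mean, second_moment - mean * mean


# Urban: weight 0.4, mean 5, variance 2. Rural: weight 0.6, mean 4, variance 3.
mean, var = mixture_mean_var([0.4, 0.6], [5.0, 4.0], [2.0, 3.0])
print(f"mean = {mean:.2f}, variance = {var:.2f}")  # mean = 4.40, variance = 2.84
```

The same variance also follows from the law of total variance, as the within-group part plus the between-group part: $0.4\cdot 2+0.6\cdot 3+0.4\cdot 0.6\cdot(5-4)^2=2.6+0.24=2.84$.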

2
On

You are right about the mean of the new set (the whole population). The mean is the expectation as you have it. The variance on the other hand depends very much on the distribution of the 2 subsets.

Even with a known distribution, say normal, computing the new variance would be very difficult and I would probably employ simulation to do it.
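A simulation of the kind mentioned here could look like the following minimal Python sketch. It assumes both groups have normally distributed incomes, which the problem does not state; the function name, sample size, and seed are also illustrative choices. Each draw first picks a group (urban with probability 0.4), then samples an income from that group.

```python
import random


def simulate_population_income(n_samples=200_000, seed=0):
    """Monte Carlo estimate of the population mean and variance of income.

    Assumption (not in the problem): incomes are normal within each group.
    Urban: mean 5, variance 2. Rural: mean 4, variance 3.
    """
    rng = random.Random(seed)
    incomes = []
    for _ in range(n_samples):
        if rng.random() < 0.4:
            # Urban resident; gauss takes a standard deviation, not a variance.
            incomes.append(rng.gauss(5.0, 2.0 ** 0.5))
        else:
            # Rural resident.
            incomes.append(rng.gauss(4.0, 3.0 ** 0.5))
    mean = sum(incomes) / n_samples
    # Population-style variance estimate (divide by n).
    var = sum((x - mean) ** 2 for x in incomes) / n_samples
    return mean, var


est_mean, est_var = simulate_population_income()
print(f"estimated mean = {est_mean:.3f}, estimated variance = {est_var:.3f}")
```

For comparison, the mixture formula applied to the stated group means and variances gives a mean of 4.4 and a variance of 2.84, and the estimates should land close to those values. The formula uses only each group's mean and variance, so the normality assumption affects the simulated samples but not the target values.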