Question about normal approximation and variance


This isn't so much a question about getting a right answer as much as it's about understanding a mathematical concept, but I will give you the problem that spawned it:

An analysis of data shows that the annual income of a randomly chosen individual from country A has mean \$18000 and standard deviation \$6000, and the annual income of a randomly chosen individual from country B has mean \$31000 and standard deviation \$8000. Find the approximate probability that the average income from B is at least \$15000 larger than the average income from A.

I know how to get the answer to this question, so never mind that. My question comes into play when getting the variance for the normalizations of A and B.

For instance, why is the variance for A calculated as $(1/100)^2$ times the sum of the 100 individual variances? Likewise for B. The $1/100^2$ is throwing me off. Why are we multiplying by $1/100^2$?


There are 3 best solutions below

2
On

If $X$ is a random variable, the variance of $100X$ is $100^2$ times the variance of $X$. Likewise the variance of $X/100$ is $1/100^2$ times the variance of $X$.

I don't know what that has to do with your problem, but that is a common way for the square of a number to show up as a factor when computing a variance.
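The scaling rule is easy to check numerically. The following is a minimal sketch using only the Python standard library; the simulated data (normal draws with mean 18000 and standard deviation 6000), the seed, and the variable names are assumptions chosen for illustration and are not part of the original problem.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical data: 10,000 draws with mean 18000 and standard deviation 6000.
xs = [random.gauss(18000, 6000) for _ in range(10_000)]

var_x = statistics.variance(xs)
var_scaled_up = statistics.variance([100 * x for x in xs])
var_scaled_down = statistics.variance([x / 100 for x in xs])

# Each ratio should equal the square of the scaling constant,
# up to floating-point rounding.
ratio_up = var_scaled_up / var_x      # compare with 100**2
ratio_down = var_scaled_down / var_x  # compare with 1 / 100**2
print(ratio_up, ratio_down)
```

The identity holds for the sample variance of any data set, not only for normal draws, so the two ratios match $100^2$ and $1/100^2$ regardless of the seed.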

1
On

You seem to be missing some information when describing your question, and this missing information is critical to the reason behind your question.

Suppose I observe one individual from country $A$, and suppose for illustration that their income is a single normally distributed random variable with mean $\mu_A = 18000$ and standard deviation $\sigma_A = 6000$. If I observe two randomly selected individuals from this country, the total of their incomes is normally distributed with mean $\mu = 2\mu_A$ and standard deviation $\sigma = \sqrt{2} \sigma_A$. This follows from a nice property of the normal distribution: if $X_1 \sim {\rm Normal}(\mu_1, \sigma^2_1)$, $X_2 \sim {\rm Normal}(\mu_2, \sigma^2_2)$ are independent normal random variables with respective means $\mu_1, \mu_2$ and variances $\sigma^2_1, \sigma^2_2$, then $$X_1 + X_2 \sim {\rm Normal}(\mu_1 + \mu_2, \sigma^2_1 + \sigma^2_2),$$ that is to say, their sum is also normal, their means add, and their variances add. So it is not hard to see that in the general case of the sum of $n$ independent and identically distributed normal random variables, the distribution of the total is also normal with mean $n\mu$ and variance $n\sigma^2$, hence standard deviation $\sqrt{n}\sigma$. (The rules for the mean and variance of a sum of independent random variables do not require normality; normality only determines the shape of the distribution of the total.)

That said, we want the sample mean, not the total, of $n$ observations. In your case $n = 100$, but this was not explicitly stated when you posed the question. So we want the distribution of $\bar X = \tfrac{1}{100}(X_1 + X_2 + \cdots + X_{100})$, where each $X_i$ is normal with mean $\mu_A = 18000$ and standard deviation $\sigma_A = 6000$. Since we also know that $${\rm Var}[cX] = c^2 {\rm Var}[X]$$ for any non-random constant $c$ (this is a straightforward consequence of the definition $${\rm Var}[X] = {\rm E}[(X - {\rm E}[X])^2],$$ and the linearity of expectation), it follows that $$\sigma_{\bar A}^2 = \frac{1}{n^2} \cdot n \sigma^2_A = \frac{\sigma^2_A}{n}.$$ The factor $1/100^2$ is just $c^2$ with $c = 1/100$, the constant that turns the total into the sample mean. With $n = 100$ this gives $\sigma_{\bar A}^2 = 6000^2/100 = 360000$.
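A small simulation can make the difference between the total and the sample mean concrete. This is only a sketch under stated assumptions: incomes are drawn from a normal distribution as in the illustration above, $n = 100$ is the sample size implied by the question, and the number of trials and the seed are arbitrary choices.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

MU_A, SIGMA_A = 18000, 6000
N = 100        # individuals per sample (assumed from the question)
TRIALS = 2000  # number of simulated samples (arbitrary)

totals = []
means = []
for _ in range(TRIALS):
    sample = [random.gauss(MU_A, SIGMA_A) for _ in range(N)]
    total = sum(sample)
    totals.append(total)
    means.append(total / N)  # the sample mean is the total scaled by 1/N

var_total = statistics.variance(totals)  # theory: N * SIGMA_A**2 = 3.6e9
var_mean = statistics.variance(means)    # theory: SIGMA_A**2 / N = 360000
print(var_total, var_mean)
```

Because every mean is exactly its total divided by 100, the two estimated variances differ by a factor of $100^2$, while each lands near its theoretical value only up to simulation noise.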

0
On

Answer:

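With samples of $n = 100$ individuals, the sample means are approximately $\bar B \sim N(31000, 640000)$ and $\bar A \sim N(18000, 360000)$.

We want $$P(\bar B - \bar A \ge 15000).$$

$$E(\bar B - \bar A) = E(\bar B) - E(\bar A)$$

$$Var(\bar B + (-1)\bar A) = Var(\bar B) + (-1)^2 Var(\bar A) = Var(\bar B) + Var(\bar A)$$

$$E(\bar B) = 31000, \quad E(\bar A) = 18000, \quad Var(\bar A) = 36\times10^4, \quad Var(\bar B) = 64\times10^4$$

$$Var(\bar B - \bar A) = 100\times10^4 = 10^6$$

$$P(\bar B - \bar A \ge 15000) = P\left(Z \ge \frac{15000 - 13000}{\sqrt{10^6}}\right) = P\left(Z \ge \frac{2000}{1000}\right)$$ $$= P(Z \ge 2) = 1 - P(Z < 2) = 1 - 0.97725 = 0.02275$$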

The answer to your question is $Var(cX) = c^2 Var(X)$.
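The tail probability depends heavily on how many individuals each average is taken over, so here is a hedged sketch that makes the sample size an explicit parameter. The function name `prob_diff_at_least` and its arguments are hypothetical, the two samples are assumed independent and of equal size `n`, and `statistics.NormalDist` requires Python 3.8 or later.

```python
from math import sqrt
from statistics import NormalDist


def prob_diff_at_least(threshold, n, mu_a=18000, sd_a=6000, mu_b=31000, sd_b=8000):
    """Approximate P(mean of B - mean of A >= threshold) for samples of size n."""
    mean_diff = mu_b - mu_a
    # Var(mean) = sigma^2 / n for each country; variances of independent terms add.
    var_diff = sd_b**2 / n + sd_a**2 / n
    z = (threshold - mean_diff) / sqrt(var_diff)
    return 1 - NormalDist().cdf(z)


p_100 = prob_diff_at_least(15000, n=100)  # z = 2
p_1 = prob_diff_at_least(15000, n=1)      # z = 0.2
print(p_100, p_1)
```

With `n=100` the standardized value is $z = 2$ and the probability is roughly 0.0228; with `n=1` (a single individual from each country) it is $z = 0.2$ and roughly 0.4207.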