Suppose we have two populations of people in different parts of the world and we want to talk about the variation in heights between the two populations. As I understand it, statisticians are talking about the variation of height in people randomly selected from each population. Suppose that everyone in one population is 5 feet tall and everyone in the other population is 6 feet tall. Would it be correct to say that the within population variation is 0 but the between population variation is 1 foot? Alternatively, if the mean and variance of the two populations were identical, would it be right to say that the between population variation is 0 and the within population variation is just the usual variance? Is there a simple general formula for determining the two values?
What is meant by statisticians when they talk about between population differences vs within population differences?
Asked by Bumbble Comm. There are 2 answers below.
One way to compare within and between population differences is by using effect sizes. A popular measure of effect size is Cohen's $d$. Consider two samples with means $\mu_1$ and $\mu_2$, standard deviations $\sigma_1$ and $\sigma_2$, and sample sizes $n_1$ and $n_2$. Then Cohen's $d$ is given by $$d = \frac{\mu_1-\mu_2}{s}$$ where $s$ is the pooled standard deviation given by $$s = \sqrt{\frac{(n_1-1)\sigma_1^2+(n_2-1)\sigma^2_2}{n_1+n_2-2}}.$$ If there are no samples but the distribution parameters are known, then the pooled standard deviation is given by $$s = \sqrt{\frac{\sigma_1^2+\sigma_2^2}{2}}.$$ If the standard deviations are equal, i.e. $\sigma_1=\sigma_2$, then the pooled standard deviation equals them as well, i.e. $\sigma_1=\sigma_2=s$. Hence, you can think of Cohen's $d$ as answering "how big is the difference between the population means in standard deviation units?". If the magnitude of $d$ exceeds $1$, i.e. $|d|>1$, then the means differ by more than one pooled standard deviation, and the difference between populations can be said to be larger than the difference within populations.
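As a rough illustration of the formulas above, here is a minimal Python sketch. The function names `pooled_sd` and `cohens_d` and the sample numbers (heights in feet) are made up for the example; they do not come from any particular library or data set.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two samples, weighted by degrees of freedom."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def cohens_d(mean1, mean2, s):
    """Difference between the two means, in units of the pooled standard deviation s."""
    return (mean1 - mean2) / s

# Hypothetical samples: mean heights 5.5 ft and 5.75 ft, both with sd 0.25 ft.
s = pooled_sd(0.25, 50, 0.25, 80)
d = cohens_d(5.5, 5.75, s)
print(s, d)  # equal sds give s = 0.25, so d = -1.0
```

Note that in the question's extreme case, where everyone in each population has exactly the same height, both standard deviations are $0$, so $s=0$ and $d$ is undefined (the sketch would raise `ZeroDivisionError`).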
If everyone is 5 feet tall in country A and 6 feet tall in country B, then the two countries have mean heights of 5 and 6 feet respectively, each with zero variance. However, if you want to describe the two together, you need to know the population sizes. For example, if there are $n_A$ and $n_B$ people in countries A and B, then the mean height across both is $m=\frac{5n_A+6n_B}{n_A+n_B}$. Similarly, the variance across both countries needs to take the respective population sizes into account: $\frac{1}{n_A+n_B}(n_A(5-m)^2+n_B(6-m)^2)$.
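On the question of a general formula: one standard way to split the combined variance is into a "within" part (the size-weighted average of each group's variance) and a "between" part (the size-weighted variance of the group means around $m$), which add up to the total. The sketch below assumes population variances (dividing by $n$, not $n-1$); the helper name `combine` and the group sizes are hypothetical.

```python
def combine(groups):
    """groups: list of (n, mean, variance) tuples, using population variances.

    Returns (overall_mean, within, between); the variance of the
    combined population is within + between.
    """
    total_n = sum(n for n, _, _ in groups)
    m = sum(n * mean for n, mean, _ in groups) / total_n
    # Size-weighted average of the variances inside each group.
    within = sum(n * var for n, _, var in groups) / total_n
    # Size-weighted variance of the group means around the overall mean.
    between = sum(n * (mean - m) ** 2 for n, mean, _ in groups) / total_n
    return m, within, between

# The example from the question, assuming 100 people in each country.
m, within, between = combine([(100, 5.0, 0.0), (100, 6.0, 0.0)])
print(m, within, between)  # 5.5 0.0 0.25
```

With equal population sizes, the within part is $0$ and the between part is $0.25$ square feet, i.e. a standard deviation of half a foot. If instead the two groups had identical means, the between part would be $0$ and the total would be just the within part.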
Think of it this way: if there is 1 person in country $A$ and a million in country $B$, then saying the variation is 1 foot overall does not reflect the disparity in population sizes. It really only reflects an extreme outlier.
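To make this concrete, the variance formula above can be simplified. Since $5-m=-\frac{n_B}{n_A+n_B}$ and $6-m=\frac{n_A}{n_A+n_B}$, we get $$\frac{1}{n_A+n_B}\left(n_A(5-m)^2+n_B(6-m)^2\right)=\frac{n_A n_B^2+n_B n_A^2}{(n_A+n_B)^3}=\frac{n_A n_B}{(n_A+n_B)^2}.$$ With $n_A=n_B$ this equals $\frac14$, whereas with $n_A=1$ and $n_B=10^6$ it is $$\frac{10^6}{(10^6+1)^2}\approx 10^{-6},$$ so the combined variance is almost zero even though the two means still differ by a full foot.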