I encountered a variation of this problem as a test task for a job interview. It has been a long time since I last worked with probability theory, so I couldn't solve it. Still, it has deprived me of inner peace, which is why I seek your help.
Consider a standard normal distribution D with mean $0$ and variance $1$. There were several subproblems to be solved.
1) Take two samples from D, 20 elements each: $\{X_1, X_2, ..., X_{20}\}$ and $\{Y_1,Y_2,...,Y_{20}\}$. Compute sample means $\bar{X}$ and $\bar{Y}$. Find the variance of $W = \bar{X} - \bar{Y}$.
Well, this one I seem to have managed. I assumed all $X$'s and $Y$'s to be independent and identically distributed as $N(0, 1)$, so $\bar{X}$ and $\bar{Y}$ are also independent, and
$$Var(W) = Var(\bar{X} - \bar{Y}) = Var(\bar{X})+Var(\bar{Y})=\frac{\sigma^2}{n}+\frac{\sigma^2}{n}=\frac{1}{20}+\frac{1}{20}=\frac{1}{10}.$$
However, what's next is a complete puzzle to me:
2) Take two samples from D, 40 elements each. From each of these two samples respectively, randomly pick 20 elements without repetition, creating subsamples $X^*_1 ... X^*_{20}$ and $Y^*_1 ... Y^*_{20}$. In these subsamples, compute means $\bar{X^*}$ and $\bar{Y^*}$. The task is to find $Var(\bar{X^*}-\bar{Y^*})$.
3) Take two samples from D, 40 elements each. From each of these two samples respectively, randomly pick 20 elements, repetition allowed, creating subsamples $X^*_1 ... X^*_{20}$ and $Y^*_1 ... Y^*_{20}$. In these subsamples, compute means $\bar{X^*}$ and $\bar{Y^*}$. The task is to find $Var(\bar{X^*}-\bar{Y^*})$.
These "layers" of sampling drove me crazy. Please give me a meaningful piece of explanation!
For $(2)$ the process of collecting $40$ observations and then throwing $20$ of them away is really the same experiment as just taking $20$ in the first place, exactly as in $(1)$. So the answer is $1/10$.
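The equivalence claimed for $(2)$ can also be checked with the law of total variance. This is only a sketch, assuming the $40$ original draws are i.i.d. $N(0,1)$ and the $20$ elements are chosen uniformly at random without replacement; the symbols $\bar{X}_{40}$ (mean of the full sample) and $S^2$ (its unbiased sample variance, with $E[S^2]=1$) are introduced here for illustration. Conditionally on the full sample, the subsample mean has expectation $\bar{X}_{40}$ and variance given by the usual finite-population correction:

$$Var(\bar{X^*} \mid X_1,\dots,X_{40}) = \frac{S^2}{20}\left(1-\frac{20}{40}\right) = \frac{S^2}{40}.$$

Therefore

$$Var(\bar{X^*}) = E\left[\frac{S^2}{40}\right] + Var(\bar{X}_{40}) = \frac{1}{40}+\frac{1}{40} = \frac{1}{20},$$

and, since the two subsamples are independent, $Var(\bar{X^*}-\bar{Y^*}) = \frac{1}{20}+\frac{1}{20} = \frac{1}{10}$, in agreement with $(1)$.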
Repetition isn't quite so easy. Here is some R code; the simulated variance appears to converge to about $0.147735$:
    # Monte Carlo estimate of Var(mean(X*) - mean(Y*)), subsampling with replacement
    n_sims <- 1000000
    diffs <- numeric(n_sims)
    for (i in 1:n_sims) {
      sample1 <- rnorm(40, 0, 1)
      sample2 <- rnorm(40, 0, 1)
      # pick 20 of the 40 values, repetition allowed
      subsample1 <- sample1[sample(1:40, 20, replace = TRUE)]
      subsample2 <- sample2[sample(1:40, 20, replace = TRUE)]
      diffs[i] <- mean(subsample2) - mean(subsample1)
    }
    var(diffs)
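The simulated value of about $0.1477$ can be compared with a short derivation using the law of total variance. This is a sketch under the assumption that the $40$ original draws are i.i.d. $N(0,1)$ and each of the $20$ picks is uniform over the $40$ values, independently of the others; the symbols $\bar{X}_{40}$ (mean of the full sample) and $\hat{\sigma}^2 = \frac{1}{40}\sum_{i=1}^{40}(X_i-\bar{X}_{40})^2$ are introduced here for illustration. Conditionally on the full sample, the picks are i.i.d. with mean $\bar{X}_{40}$ and variance $\hat{\sigma}^2$, so

$$Var(\bar{X^*} \mid X_1,\dots,X_{40}) = \frac{\hat{\sigma}^2}{20}, \qquad E[\hat{\sigma}^2] = \frac{39}{40}.$$

Hence

$$Var(\bar{X^*}) = E\left[\frac{\hat{\sigma}^2}{20}\right] + Var(\bar{X}_{40}) = \frac{39}{800}+\frac{1}{40} = \frac{59}{800},$$

and, since the two subsamples are independent,

$$Var(\bar{X^*}-\bar{Y^*}) = 2\cdot\frac{59}{800} = \frac{59}{400} = 0.1475,$$

which is consistent with the simulation.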