I have what might be a silly question. I am tutoring a family member in stats and I am trying to help her understand the sampling distribution of sample means, but I don't quite understand why it's derived the way it is myself.
As I understand it, this is the distribution of the sample means over all possible samples of size n. Her notes contain a simple example the teacher gave:
Suppose we have a small population: {2, 4, 6, 8}. Let's find all possible samples of size n = 2.
I would think all possible samples of size two would be:
{2, 4}, {2, 6}, {2, 8}, {4, 6}, {4, 8} and {6, 8}.
But according to the notes, there are 16 possible samples of size two, and they are taken with replacement. For example, {2, 2} is one of the possible samples, and {2, 4} is counted as distinct from {4, 2}. I don't understand why these are counted. The sampling distribution of sample means is used in hypothesis testing. Say you are testing a claim about the mean height of adult males and you want to select a sample of size n = 10: you would not count the same individual 10 times. What am I missing?
Thank you.
A sample, by definition, is a sequence of draws from the population. The sample $\{2,4\}$ implies that $2$ is drawn before $4$, whereas $\{4,2\}$ implies the opposite. In problems where the draws are not independent, the order matters, as it has implications for the probability distribution of the remaining elements to be drawn.
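To make the two ways of counting concrete, here is a minimal sketch in Python (standard library only) that enumerates both kinds of sample from the population in the question and tabulates the sample means. The variable and function names are just illustrative, and the "without replacement, order ignored" case is included only for comparison with the six samples listed in the question.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, product

population = [2, 4, 6, 8]
n = 2

# Ordered draws with replacement: (2, 4) and (4, 2) are distinct, (2, 2) is allowed.
ordered_with_replacement = list(product(population, repeat=n))

# Unordered draws without replacement: the six samples listed in the question.
unordered_without_replacement = list(combinations(population, n))


def mean_distribution(samples):
    """Map each possible sample mean to its probability,
    treating every listed sample as equally likely."""
    counts = Counter(Fraction(sum(s), len(s)) for s in samples)
    total = len(samples)
    return {m: Fraction(c, total) for m, c in sorted(counts.items())}


dist_with = mean_distribution(ordered_with_replacement)
dist_without = mean_distribution(unordered_without_replacement)

print(len(ordered_with_replacement), "ordered samples with replacement")
for m, p in dist_with.items():
    print(f"  mean {m}: probability {p}")

print(len(unordered_without_replacement), "unordered samples without replacement")
for m, p in dist_without.items():
    print(f"  mean {m}: probability {p}")
```

Running it gives 16 samples in the first case and 6 in the second. Both distributions are centred on the population mean of 5, but they assign different probabilities to the individual means, so which counting convention is used does change the sampling distribution.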