Sampling Distribution Notations


I'm reading a chapter on sampling distributions of a statistic and I don't seem to have an understanding of the notations used.

From probability theory, a random variable is usually denoted by a capital letter, say $X$. The values that $X$ can take are denoted by lower-case letters, in this case $x_1, x_2, \dots$. It is easy to understand what terms like expected value and variance mean because $X$ is a set of numerical values.

Now, we come to the idea of a sampling distribution. Here is an excerpt from the book I'm using:

When a simple random sample of numerical values is drawn from a population, each item in the sample can be thought of as a random variable. A random sample of size $n$ consists of the random variables $X_1, X_2, \dots, X_n$ that may be treated as independent random variables, all with the same distribution.

Here are the things that I don't understand:

* If the sample is of size $n$, it means that every random variable $X_i$ represents just a single value. So what does it mean that each $X_i$ has the same distribution when all it represents is a single value?

This is the first example that was given: The temperatures on a random sample of five days are 10, 20, 30, 40, 50. Calculate the value of the statistic $\bar{X}$. Here's what they did: $$\bar{x}=\frac{\sum{x}}{n}=\frac{10+20+30+40+50}{5}$$

As you can see, they made use of lower-case letters here, which made me even more confused. They even gave an explanation for it: "We use $\bar{x}$ for the mean value and $\bar{X}$ for the mean of the statistic." I don't even know if that is supposed to explain why they switched to lower case.

Anyway, I hope I have been able to clearly describe my problem and I hope someone can help in providing a clarification.

Best answer

Yes, there is a subtle difference between the two.

$X_1,X_2,\dotsc, X_n$ are still random: they have not been set/defined/drawn etc. Since they are still random variables, $$X_1\overset d= X_2 \overset d=\dotsb \overset d= X_n \overset d= X, \qquad X\sim G$$ for some arbitrary distribution $G$.

Suppose $E[X] = \mu$. Notice that since each $X_i$ is random, then the arithmetic mean $$\bar X = \frac{X_1+\dotsb+X_n}{n}$$ is also random. We can also compute its expectation, $$E[\bar X] = E\left[\frac{X_1+\dotsb+X_n}{n}\right] = \frac{1}{n}\bigg[E[X_1]+\dotsb+E[X_n]\bigg] = \frac{1}{n}\cdot n\cdot\mu = \mu.\tag 1$$

Example.
Suppose $n = 3$ and $X\sim \text{unif}(0,1)$. Then $E[X] = 0.5$, and so by $(1)$ we have $$E[\bar X] = E\bigg[ \frac{X_1+X_2+X_3}{3}\bigg] = 0.5.$$

Now, suppose we generate three values from a $\text{unif}(0,1)$ distribution and they are $x_1 = 0.1,\; x_2 = 0.2,\; x_3 = 0.3$. Notice now that the $x_i$ are not random: they have been drawn/fixed/defined etc. Their mean is $$\bar x = \frac{0.1+0.2+0.3}{3} = 0.2$$

So, in the long run, we expect $\bar X$ to be about $0.5$. However, in our particular case, we ended up with a mean of $\bar x = 0.2$ for our sample.
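The distinction between the random $\bar X$ and one realized $\bar x$ can also be seen numerically. Below is a minimal simulation sketch, assuming Python's standard `random` module as the source of $\text{unif}(0,1)$ draws; the helper name `sample_mean`, the seed, and the number of repetitions are all arbitrary choices made for illustration, not part of the original example.

```python
import random
import statistics


def sample_mean(n, rng):
    """Draw n values from unif(0,1) and return their arithmetic mean.

    Each call produces one realization (a lower-case x-bar) of the
    random variable X-bar.
    """
    return sum(rng.random() for _ in range(n)) / n


rng = random.Random(0)  # fixed seed: an arbitrary choice, for reproducibility only

# One realized sample mean, playing the role of x-bar = 0.2 in the example.
x_bar = sample_mean(3, rng)

# Repeating the experiment many times approximates the sampling
# distribution of X-bar; its average should be close to E[X-bar] = 0.5.
means = [sample_mean(3, rng) for _ in range(100_000)]
long_run_average = statistics.fmean(means)

print(f"one realized x-bar: {x_bar:.3f}")
print(f"average of 100000 sample means: {long_run_average:.3f}")
```

A single call to `sample_mean` can land anywhere between 0 and 1, just as $\bar x = 0.2$ did, while the average over many repetitions should settle near $0.5$, in line with $(1)$.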