Why is the sample mean a random variable?


I know that a random variable is a measurable function from some measurable space $(\Omega, \mathscr{F})$ to some Borel space $(R, \mathscr{B})$.

Suppose I am given some random sample $X_1, \dots, X_n$. By the definition of a random sample, each $X_i: \Omega \rightarrow R$ is a measurable function whose common domain is the sample space $\Omega$.

I know that the sample mean is a random variable. Hence it is a measurable function.

So what type of function is the sample mean exactly? From which measurable space to which? And what exactly does the notation $\bar{X} = n^{-1} \sum_{i=1}^n X_i$ mean, if it does not mean that $\bar{X}(s) = n^{-1} \sum_{i=1}^n X_i(s)$?

I ask because defining a function $f$ by $f = v + g$, where $g: A \rightarrow R$ and $v: A \rightarrow R$, usually does mean that $f(x) = v(x) + g(x)$ for all $x \in A$.

Accepted answer:

With this setup the sample mean is another measurable function $\Omega \to \mathbb{R}$ and it is just given by $\bar{X}(s) = \frac{1}{n} \sum_{i=1}^n X_i(s)$. The entire subtlety of this question, which you've glossed over, is how one actually defines the sample space $\Omega$ and the functions $X_i$ in it in general!
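A minimal sketch of this pointwise definition, using a hypothetical finite sample space (two coin flips) not taken from the question itself. It shows that $\bar{X}$ really is defined outcome by outcome, just like any other sum of functions:

```python
# Toy sample space: outcomes of two coin flips.
omega = ["HH", "HT", "TH", "TT"]

# Two random variables on Omega: indicators that the first (resp. second)
# flip came up heads. Each is a function Omega -> R.
def X1(s):
    return 1 if s[0] == "H" else 0

def X2(s):
    return 1 if s[1] == "H" else 0

def sample_mean(s):
    # Xbar(s) = (X1(s) + X2(s)) / 2 -- the pointwise average,
    # evaluated at a single outcome s.
    return (X1(s) + X2(s)) / 2

for s in omega:
    print(s, sample_mean(s))
# prints: HH 1.0 / HT 0.5 / TH 0.5 / TT 0.0
```

So $\bar{X}$ is a function on the same domain $\Omega$ as the $X_i$, and the notation $\bar{X} = n^{-1}\sum_i X_i$ is the ordinary pointwise sum of functions.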

For example, suppose I want the $X_i$ to be $n$ independent samples from a normal distribution $N(\mu, \sigma)$. What is the sample space? It is not the sample space $\mathbb{R}$ of a single sample from a normal distribution. In fact it is $\mathbb{R}^n$, the product of $n$ copies of the sample space of a single sample, equipped with the product measure, and the $X_i$ are the $n$ coordinate projections $\mathbb{R}^n \to \mathbb{R}$. This construction is how we guarantee independence. So the sample mean is again another function $\mathbb{R}^n \to \mathbb{R}$ given by the mean of the $n$ coordinates.
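The product-space construction above can be sketched in code. Here an outcome $s$ is a single point of $\mathbb{R}^n$, each $X_i$ is the $i$-th coordinate projection, and the sample mean averages the coordinates; the particular values of $\mu$, $\sigma$, and $n$ are illustrative choices, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 10.0, 2.0, 5  # illustrative parameters

# One outcome s in the product sample space R^n:
# n independent draws from N(mu, sigma).
s = rng.normal(mu, sigma, size=n)

# X_i is the i-th coordinate projection R^n -> R.
def X(i, s):
    return s[i]

# The sample mean is another function R^n -> R:
# the average of the n coordinates of the outcome s.
xbar = sum(X(i, s) for i in range(n)) / n
assert np.isclose(xbar, s.mean())
```

Note that independence of the $X_i$ is baked into the construction: the product measure on $\mathbb{R}^n$ makes the coordinate projections independent by definition.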

Generally - and this is a surprisingly subtle point I've only seen explained well by Terence Tao, here and here - thinking of random variables as measurable functions on a fixed sample space is something of a distraction, because in probability theory we always retain the freedom to enlarge the sample space as necessary to accommodate new sources of randomness (e.g. in this case adding another sample $X_{n+1}$). As Tao says:

At a purely formal level, one could call probability theory the study of measure spaces with total measure one, but that would be like calling number theory the study of strings of digits which terminate. At a practical level, the opposite is true: just as number theorists study concepts (e.g. primality) that have the same meaning in every numeral system that models the natural numbers, we shall see that probability theorists study concepts (e.g. independence) that have the same meaning in every measure space that models a family of events or random variables [emphasis mine]. And indeed, just as the natural numbers can be defined abstractly without reference to any numeral system (e.g. by the Peano axioms), core concepts of probability theory, such as random variables, can also be defined abstractly, without explicit mention of a measure space [emphasis mine]; we will return to this point when we discuss free probability later in this course.

And:

In order to have the freedom to perform extensions every time we need to introduce a new source of randomness, we will try to adhere to the following important dogma: probability theory is only “allowed” to study concepts and perform operations which are preserved with respect to extension of the underlying sample space. (This is analogous to how differential geometry is only “allowed” to study concepts and perform operations that are preserved with respect to coordinate change, or how graph theory is only “allowed” to study concepts and perform operations that are preserved with respect to relabeling of the vertices, etc..)

As long as one is adhering strictly to this dogma, one can insert as many new sources of randomness (or reorganise existing sources of randomness) as one pleases; but if one deviates from this dogma and uses specific properties of a single sample space, then one has left the category of probability theory and must now take care when doing any subsequent operation that could alter that sample space. This dogma is an important aspect of the probabilistic way of thinking, much as the insistence on studying concepts and performing operations that are invariant with respect to coordinate changes or other symmetries is an important aspect of the modern geometric way of thinking. With this probabilistic viewpoint, we shall soon see the sample space essentially disappear from view altogether [emphasis mine], after a few foundational issues are dispensed with.

A sample space together with a measurable function on it is analogous to a representation of a group: it is a "representation" of a random variable, not the invariant content of the random variable itself (which is given by its CDF or its moments, for example). This can be made precise in various ways using algebras of random variables; see e.g. this old blog post of mine on noncommutative probability.