I am reading the textbook "Mathematical Statistics" (2nd edition) by Jun Shao: http://www.mim.ac.mw/books/Mathematical%20statistics%202nd%20edition.pdf
In precise terms, what does the author mean by "$X$ is a sample from population $P \in \mathcal{P}$"? To phrase my question another way, how is $X$ related to the probability measure $P \in \mathcal{P}$? Many theorems begin with this sentence, but I could not find a precise definition of it. I need to sort this out before moving on.
Here are the earliest related definitions I can find:
On pages 91-92:
In statistical inference and decision theory, the data set is viewed as a realization or observation of a random element defined on a probability space $(\Omega, \mathcal{F}, P)$ related to the random experiment. The probability measure $P$ is called the population. The data set or the random element that produces the data is called a sample from $P$. A population $P$ is known if and only if $P(A)$ is a known value for every event $A \in \mathcal{F}$.
Later on page 92:
In statistical inference and decision theory, the data set, $(x_1, ..., x_n)$, is viewed as an outcome of the experiment whose sample space is $\Omega = \mathcal{R}^n$. We usually assume that the $n$ measurements are obtained in $n$ independent trials of the experiment. Hence, we can define a random $n$-vector $X = (X_1, ..., X_n)$ on $\prod_{i=1}^n(\mathcal{R}, \mathcal{B},P)$ whose realization is $(x_1, ..., x_n)$. The population in this problem is $P$ (note that the product probability measure is determined by $P$) and is at least partially unknown.
But later, the author does not seem to be consistent about this. For example, on page 96 there is a sentence saying "$\Omega$ is usually $\mathcal{R}^k$." On page 100, the beginning of Section 2.2 says "Let us assume now that our data set is a realization of a sample $X$ (a random vector) from an unknown population $P$ on a probability space."
The paragraph on page 92 seems to suggest that $X = (X_1, \cdots, X_n)$ has nothing to do with $P$: it just seems to be a random vector defined on the measurable space $\prod_{i=1}^n(\mathcal{R}, \mathcal{B})$. Yet the way the sentence in question is constructed suggests that $X$ and $P$ are somehow related, and they should be, since many important theorems (like Theorem 2.2 on page 104) start with this sentence.
My guess to make sense of it:
Whenever the sentence "$X$ is a sample from population $P \in \mathcal{P}$" is mentioned, assume the following:
There is a probability space $(\Omega, \mathcal{F}, \mu)$ and $X = (X_1, \cdots, X_n): \Omega \to \mathcal{R}^n$ is a Borel function (i.e. $\mathcal{R}^n$ valued random variable or $n$-dimensional random vector).
$\mathcal{P}$ is a (given) collection of Borel probability measures on $\mathcal{R}$. The probability measure $\mu$ on $\mathcal{F}$ is on a different space and is a priori assumed to be unrelated to $\mathcal{P}$.
$X_1, \cdots, X_n$ are independent and identically distributed with common law $P$ and it just happens that $P \in \mathcal{P}$.
Is my guess correct?
This is correct.
Once you know $\mu$ and $X$ you can define the law (or distribution) of $X$ (see p.8-9). The sentence "$X$ is a sample from population $P \in \mathcal{P}$" means that the law of $X$ is exactly the measure $P$ (i.e., $\mu_X=P$), and $P\in \mathcal P$.
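To spell out the notation, here is a sketch in the symbols of the question (the sets $B$, $B_i$ and the shorthand $P^n$ for the product measure are my notation, not necessarily the book's). The law of $X$ is the pushforward of $\mu$ under $X$:

$$\mu_X(B) = \mu\big(X^{-1}(B)\big) = \mu\big(\{\omega \in \Omega : X(\omega) \in B\}\big) \quad \text{for every Borel set } B \subseteq \mathcal{R}^n.$$

In the i.i.d. setting of page 92, where $P$ is a probability measure on $(\mathcal{R}, \mathcal{B})$, the law of the whole vector is the product measure determined by $P$:

$$\mu\big(X_1 \in B_1, \ldots, X_n \in B_n\big) = \prod_{i=1}^n P(B_i) \quad \text{for all Borel sets } B_1, \ldots, B_n \subseteq \mathcal{R}.$$

So "the law of $X$ is $P$" should be read as $\mu_X = P$ when $P$ denotes a measure on the range of $X$, and as $\mu_X = P^n$ when $P$ denotes the common law of a single $X_i$, as in your guess.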