I am curious to know which statement is correct:
- a random variable comes from a probability distribution OR
- a probability distribution is created from observing the behavior of a random variable
Interesting question! The answer is that it can go both ways.
For any probability distribution function $F$, i.e. a nondecreasing, right-continuous function $F: (-\infty, \infty) \to [0,1]$ (and hence cadlag: right continuous with left limits) satisfying
\begin{equation} \lim_{x \to -\infty} F(x) = 0, \lim_{x \to \infty} F(x) = 1, \end{equation}
there exists a random variable $X$ with $\mathbb{P}(X \leq x) = F(x)$. Specifically, this means there exists a probability space $(\Omega, \mathcal{A}, \mathbb{P})$, and a measurable function $X: \Omega \to \mathbb{R}$ with $\mathbb{P}(\{\omega: X(\omega) \leq x\}) = F(x)$ for every $x \in \mathbb{R}$. Depending on the distribution function $F$, there may be many possible probability spaces that work, but usually $\Omega$ just sits in the background, and we don't think much about it. (It turns out that the space $\Omega = [0,1]$ with the Borel sigma-field and Lebesgue measure always works, so you can assume that's where $X$ lives if you like.)
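To make the remark about $\Omega = [0,1]$ concrete, here is a minimal sketch of one standard construction: take $X(\omega)$ to be the (generalized) inverse of $F$ evaluated at $\omega$. The choice of the exponential distribution $F(x) = 1 - e^{-x}$ and the names `quantile_exponential`, `X`, and `estimate_cdf` are illustrative assumptions, not part of the answer above.

```python
import math
import random

def quantile_exponential(u, rate=1.0):
    """Inverse of F(x) = 1 - exp(-rate * x), valid for u in [0, 1)."""
    return -math.log(1.0 - u) / rate

def X(omega):
    """A random variable on Omega = [0, 1]: just a function of the sample point."""
    return quantile_exponential(omega)

def estimate_cdf(x, n=100_000, seed=0):
    """Estimate P(X <= x) by drawing omega uniformly (Lebesgue measure) from [0, 1)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if X(rng.random()) <= x)
    return hits / n

if __name__ == "__main__":
    x = 1.0
    print("estimated P(X <= 1):", estimate_cdf(x))
    print("F(1) = 1 - exp(-1): ", 1.0 - math.exp(-x))
```

The two printed numbers should be close, which is the Monte Carlo version of the statement $\mathbb{P}(X \leq x) = F(x)$.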
Conversely, if $X$ is a measurable map from a probability space $\Omega$ into $\mathbb{R}$, then the function $F_X: \mathbb{R} \to [0,1]$ defined by $F_X(x) = \mathbb{P}(\{\omega: X(\omega) \leq x\})$ is called the distribution function of $X$. One can use the axioms of a probability space to show that $F_X$ satisfies all the properties mentioned above.
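The converse direction can be checked by hand on a small example. The sketch below assumes a toy probability space (two fair coin flips) and takes $X$ to be the number of heads; the names `omega_space`, `prob`, `X`, and `F_X` are chosen for illustration only.

```python
from fractions import Fraction
from itertools import product

# Sample space: all outcomes of two coin flips, each with probability 1/4.
omega_space = list(product("HT", repeat=2))
prob = {omega: Fraction(1, 4) for omega in omega_space}

def X(omega):
    """Random variable: number of heads in the outcome."""
    return omega.count("H")

def F_X(x):
    """Distribution function F_X(x) = P({omega : X(omega) <= x})."""
    return sum(prob[omega] for omega in omega_space if X(omega) <= x)

if __name__ == "__main__":
    for x in (-1, 0, 0.5, 1, 2):
        print(x, F_X(x))
```

Here $F_X$ is a step function that is nondecreasing, right continuous, equal to 0 below 0 and equal to 1 from 2 onward, as the general theory predicts.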
Philosophically, what is happening? Roughly speaking, a statistician would prefer the first direction, where the distribution comes first. The idea is: we observe data from some process and form an empirical distribution from it. We assume there is a random variable $X$ generating the data, and we can try to understand how the empirical distribution relates to the true distribution of $X$.
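As a rough illustration of the statistician's viewpoint, the sketch below builds an empirical distribution function from simulated data. The standard normal data source and the names `make_ecdf`, `sample`, and `F_hat` are assumptions made for the example; in practice the data would come from the observed process.

```python
import bisect
import random

def make_ecdf(data):
    """Return the empirical distribution function of the data:
    F_hat(x) = (number of observations <= x) / (number of observations)."""
    xs = sorted(data)
    n = len(xs)

    def ecdf(x):
        # bisect_right counts how many sorted observations are <= x.
        return bisect.bisect_right(xs, x) / n

    return ecdf

# Simulated "observations"; the true distribution here is standard normal.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
F_hat = make_ecdf(sample)

if __name__ == "__main__":
    # The true value of the standard normal distribution function at 0 is 0.5.
    print("F_hat(0) =", F_hat(0.0))
```

With more observations, `F_hat` tends to track the true distribution function more closely, which is the relationship between the empirical and true distributions that the statistician studies.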
For a mathematician, the other order is more typical. We define processes or variables abstractly using random variables on probability spaces, and prove theorems about them, or about the corresponding distribution functions.
Any random variable is a real-valued measurable function on the given space, equipped with some $\sigma$-algebra. It induces a distribution on $\mathbb{R}$: for example, a probability mass function when its image is a discrete set.