When discussing various ways to model something probabilistically, many authors prefer to work with random variables rather than with probability distributions. Of course, this difference is more a matter of point of view than of actual mathematical substance - yet I am very much interested in why the random-variable point of view is preferred. Let me elaborate on this below.
It seems to me that this comes from not being fully explicit and formal when building a model - for if you were, you would see that using random variables is actually rather artificial, and that working with the probability distribution directly is much more natural.
Consider the following problem:
Suppose we have a vector $x\in\mathbb{R}^{p}$ that we interpret as
the visible attributes of an individual. For example, $x$ might represent
a loan applicant's age, gender, race, and credit history.
We consider the problem of modeling whether we should give a person
represented by $x$ a loan; let $y\in\{0,1\}$ represent the target
of this prediction, i.e. whether an individual will have defaulted
on a loan he received ($y=0$) or repaid it according to his contract
($y=1$).
To formalize this problem, we can define random variables $X$ and $Y$
that take on values $X=x$ and $Y=y$ for an individual drawn randomly
from the population of interest.
We define the true risk
\begin{equation}
r(x)=Pr(Y=1|X=x)\ \ (1).
\end{equation}
Then the problem is how to estimate this risk from data, yadda, yadda.
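As a concrete (if crude) illustration of the estimation task, here is a minimal Python sketch. The synthetic population, the binary attributes, and the ground-truth risk function are all illustrative assumptions, not part of the problem statement above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic population: p = 2 binary attributes per individual.
# The "true" risk used to generate the labels is an arbitrary assumption.
p = 2
n = 100_000
X = rng.integers(0, 2, size=(n, p))            # visible attributes x
true_r = 0.2 + 0.5 * X[:, 0] * 0.8 ** X[:, 1]  # illustrative ground-truth r(x)
Y = (rng.random(n) < true_r).astype(int)       # y = 1 means the loan was repaid

def estimate_risk(x, X, Y):
    """Empirical estimate of r(x) = Pr(Y = 1 | X = x):
    the fraction of y = 1 among samples whose attributes equal x."""
    mask = np.all(X == np.asarray(x), axis=1)
    return Y[mask].mean()

print(estimate_risk((1, 0), X, Y))  # close to the true value 0.7
```

With discrete attributes this naive matched-sample frequency works; for continuous $x$ one would need smoothing or a parametric model, which is exactly where the real modeling question begins.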
The issue I mention above is related to the formulation (not the solution or theoretical framework) of this problem. Usually the above description is all that you get!
Let us investigate how we can make it even more precise:
If we begin to be more explicit, in order to even introduce random
variables $X,Y$ we need a sample space. Because these random variables
appear in the expression (1), which explicitly is
$$
r(x)=Pr(\{\omega\in\Omega:Y(\omega)=1\}|\{\omega\in\Omega:X(\omega)=x\}),
$$
the random variables furthermore need to be defined on the same sample
space. We could pick $\Omega:=\mathbb{R}^{p}\times\{0,1\}$ as a suitable
candidate, together with a distribution $\mathcal{D}$ on $\Omega$ that models
how likely each individual is to be drawn from the population. We could then define
$X:\Omega\rightarrow\mathbb{R}^{p}$ as the projection onto the first
$p$ components and $Y:\Omega\rightarrow\{0,1\}$ as the projection
onto the last component. By doing so, we have given (1) a concrete
meaning.
But defining the random variables like this is rather cumbersome; since we already needed to introduce $\Omega$ and $\mathcal{D}$ to even talk about random variables, we could just use these two ingredients to define the true risk by
\begin{equation}
r(x)=Pr(\{\omega\in\Omega:\omega_{p+1}=1\}|\{\omega\in\Omega:\omega_{1,\ldots,p}=x\})\ \ (2),
\end{equation}
where the subscripts select coordinates of $\omega$: $\omega_{p+1}$ is the last coordinate and $\omega_{1,\ldots,p}$ are the first $p$ coordinates.
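One can check mechanically that (1) and (2) denote the same number once $X$ and $Y$ are the coordinate projections. Here is a small Python sketch; the finite toy sample space and its distribution are assumptions made purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy sample space Omega = {0,1}^p x {0,1}: each outcome omega is a tuple
# (x_1, ..., x_p, y), drawn from an arbitrary distribution D (an assumption).
p = 2
n = 200_000
omega = rng.integers(0, 2, size=(n, p + 1))  # rows are draws of omega from D

# Random variables X, Y defined as coordinate projections, as in the text.
def X(w): return w[:, :p]   # projection onto the first p components
def Y(w): return w[:, p]    # projection onto the last component

x = np.array([1, 0])

# Pr(Y = 1 | X = x) via the random variables, as in (1) ...
cond1 = np.all(X(omega) == x, axis=1)
r1 = Y(omega)[cond1].mean()

# ... and directly via coordinates of omega, as in (2).
cond2 = np.all(omega[:, :p] == x, axis=1)
r2 = omega[cond2, p].mean()

print(r1 == r2)  # True: (1) and (2) are the same quantity
```

The two conditionals select literally the same subset of outcomes, which is the point: (2) just inlines the definitions of $X$ and $Y$.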
But somehow a formulation as in (2) is very rarely used. My question is: why does the community tend to prefer a vague way of defining random variables that, if made precise, is actually more tedious to set up (as I have just shown) than the formulation (2)?
Using the probability space might seem more natural, but random variables are more elegant because we usually do not care about the probability space. Yes, in real applications the probability space is relatively straightforward to point out, but it is not actually important. There is some quantity we care about, or several quantities, and they are somehow dependent on each other or they are not. It is these quantities and their interplay we really care about, so why not do the theoretical groundwork with a focus on exactly those quantities: random variables.
Another reason is that random variables give us an elegant way to describe events. Any event can be described as the preimage of a (usually simple) set under a random variable, and then knowledge about the random variable translates into knowledge about the event. In particular, (in)dependence of events can be treated elegantly via (in)dependent random variables.
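To make the preimage view concrete, here is a tiny sketch; the choice of two fair coin flips as the toy probability space is an assumption for illustration. The events are preimages of $\{1\}$ under $X$ and $Y$, and independence of the random variables immediately gives independence of the events:

```python
from itertools import product
from fractions import Fraction

# Toy sample space: two fair coin flips, uniform distribution (an assumption).
Omega = list(product([0, 1], repeat=2))
P = {w: Fraction(1, 4) for w in Omega}

X = lambda w: w[0]  # outcome of the first flip
Y = lambda w: w[1]  # outcome of the second flip

def prob(event):
    return sum(P[w] for w in event)

# Events described as preimages of simple sets under random variables:
A = {w for w in Omega if X(w) == 1}  # A = X^{-1}({1})
B = {w for w in Omega if Y(w) == 1}  # B = Y^{-1}({1})

# Independence of X and Y translates to independence of the events:
print(prob(A & B) == prob(A) * prob(B))  # True
```

Using exact rational arithmetic keeps the factorization an identity rather than an approximation.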