Difference between random variable and data with measure-theoretic formulation


Let's suppose we have some process generating data and we get IID observations $x_1, \dots, x_n$. We know that a random variable is a Borel function $X : \Omega \to \mathbb R$. Is it correct to say that each of these $x_i$ is equal to $X(\omega)$ for various $\omega \in \Omega$, and the same function $X$?

Furthermore, if we say we have random variables $X_1, \dots, X_n$, then each of these is a possibly different function from $\Omega$ to $\mathbb R$, right? But if we then specify that they're identically distributed, we have $n$ copies of the same function, so if we then observe corresponding data $x_1, \dots, x_n$ we can treat $x_i = X_i(\omega_i)$, i.e. we observe each function once, or since all $X_i$ are the same, we have $n$ observations on one function?

Sorry if this is unclear, but my inability to articulate this precisely is exactly why I'm asking this!

Best answer

To your first question:

You could view $x_i = X(\omega_i)$ if the $x_i$ denote the values that one random variable $X$ takes at different outcomes $\omega_i \in \Omega$ (but then "IID" does not really apply, since i.i.d. is a property of random variables, not of numbers). When we talk about "drawing samples according to some distribution", we usually model the samples as i.i.d. random variables $X_1, \ldots, X_n$ that follow that distribution. So if someone describes $x_1,\ldots, x_n$ as "IID", they most likely mean that the observations are realizations of different random variables, i.e. $x_i = X_i(\omega)$ for a single outcome $\omega$.
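To make the "different random variables" reading concrete, here is a minimal sketch on a toy sample space. Everything in it is an assumption chosen for illustration: $\Omega$ is taken to be all sequences of $n = 3$ fair coin flips, and the hypothetical helper `make_coordinate` builds $X_i(\omega) = \omega_i$. The data are then the values of all $X_i$ at one common outcome.

```python
from fractions import Fraction
from itertools import product

n = 3
# Toy sample space (an assumption): all sequences of n fair coin flips,
# each outcome carrying probability 1 / 2**n.
omega_space = list(product([0, 1], repeat=n))
prob = {omega: Fraction(1, len(omega_space)) for omega in omega_space}


def make_coordinate(i):
    """Return the random variable X_i : Omega -> R with X_i(omega) = omega[i]."""
    return lambda omega: omega[i]


X = [make_coordinate(i) for i in range(n)]


def distribution(rv):
    """Push the measure on Omega forward through rv: value -> probability."""
    dist = {}
    for omega, p in prob.items():
        value = rv(omega)
        dist[value] = dist.get(value, 0) + p
    return dist


# Identically distributed: every X_i has the same law,
# even though the X_i are different functions on Omega.
laws = [distribution(rv) for rv in X]

# Observing data: one outcome omega is realized, and x_i = X_i(omega).
omega = (1, 0, 1)
data = [rv(omega) for rv in X]
```

Here each entry of `laws` puts probability 1/2 on 0 and on 1, while `data` is just the list of coordinates of the single realized `omega`.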


To your second question:

Strictly speaking, "i.i.d." just means that the random variables are "independent and identically distributed"; they are not necessarily the same random variable. So what we know is the following:

  1. They are independent, i.e. events in the $\sigma$-algebra generated by one random variable are independent of events in the $\sigma$-algebra generated by another. More intuitively, their joint probability distribution factors into the product of their marginal distributions, e.g. $f_{X,Y}(x,y)=f_X(x)\cdot f_Y(y)$ in the continuous case.
  2. They have the same probability distribution. Notice that this does not mean that the random variables are the same. For example, if $X\sim U_{[0,1]}$, then $X$ and $1-X$ have the same distribution, but they are different random variables. Another example is $\sin(Y)$ and $\cos(Y)$, where $Y \sim U_{[0,2\pi]}$. Further, if $X$ is allowed to be slightly more general than Borel-measurable, say Lebesgue-measurable instead, then two random variables might have the same distribution function but not the same probability for every event.
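The $X$ versus $1-X$ example can be checked exactly on a discrete analogue. The sketch below is an illustration under stated assumptions, not the continuous case itself: $\Omega = \{0, 1, \dots, N\}$ with the uniform measure and $X(\omega) = \omega/N$, with $N = 5$ chosen arbitrarily. The names `law` and `joint` are hypothetical helpers. It also shows that having the same distribution says nothing about independence, since the joint law of $X$ and $1-X$ does not factor.

```python
from collections import Counter
from fractions import Fraction

N = 5
omega_space = range(N + 1)   # Omega = {0, 1, ..., N}, a discrete stand-in for [0, 1]
p = Fraction(1, N + 1)       # uniform probability of each outcome


def X(omega):
    """X(omega) = omega / N, uniform on the grid {0, 1/N, ..., 1}."""
    return Fraction(omega, N)


def Y(omega):
    """Y = 1 - X, defined on the same sample space."""
    return 1 - X(omega)


def law(rv):
    """Distribution of rv as a dict: value -> probability."""
    out = Counter()
    for omega in omega_space:
        out[rv(omega)] += p
    return dict(out)


def joint(rv1, rv2):
    """Joint distribution of (rv1, rv2) as a dict: (value1, value2) -> probability."""
    out = Counter()
    for omega in omega_space:
        out[(rv1(omega), rv2(omega))] += p
    return dict(out)


# Same distribution, but not the same function on Omega.
same_law = law(X) == law(Y)
same_function = all(X(w) == Y(w) for w in omega_space)

# Independence would require the joint law to factor into the marginals.
joint_xy = joint(X, Y)
independent = all(
    joint_xy.get((a, b), 0) == pa * pb
    for a, pa in law(X).items()
    for b, pb in law(Y).items()
)
```

In this toy setting `same_law` is true while `same_function` and `independent` are both false, so $X$ and $1-X$ are identically distributed without being i.i.d.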