Idea of a Random Variable



Two weeks into my class, I am still struggling with the idea of a random variable.
I can see why random variables make sense when the outcomes are numbers,
like monetary gains and losses.

But if the outcomes are not numbers, then I cannot see how random variables make sense.
For example, the outcomes of an experiment are three colors, yellow, green and blue.
Now, assign 1 to yellow, 3 to green and 4 to blue and do your calculations.
Assign new numbers, 2 to yellow, -4 to green and 0 to blue.
Then you'll have very different calculations ....

3

There are 3 best solutions below

1
On BEST ANSWER

It really depends on the kind of computations you are doing. Since $\{\text{yellow},\text{green},\text{blue}\}$ has no natural correspondence to numbers, and no natural ordering, asking for example for the mean or median makes little sense. The result would, as you realized, depend on the precise mapping of yellow, green and blue to some numbers.

Given some random variables $X_1,X_2,\ldots$ whose outcomes are $a \in \mathbb{R}$ for yellow, $b \in \mathbb{R}$ for green and $c \in \mathbb{R}$ for blue, you can, however, still ask "What's the probability that $X_1 = X_2$?". The answer to that won't depend on the precise choice of $a,b,c$ (though you'll have to pick three distinct values $a,b,c$ for the three outcomes, obviously).
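As a rough check of this point, here is a minimal Python sketch. The color probabilities and the independence of the two draws are assumptions made purely for illustration; the two encodings are the ones from the question. The probability of $X_1 = X_2$ comes out the same under both encodings, while the mean does not.

```python
from fractions import Fraction
from itertools import product

# Hypothetical color probabilities (an assumption, not from the question).
probs = {"yellow": Fraction(1, 2), "green": Fraction(1, 3), "blue": Fraction(1, 6)}

def prob_equal(encoding):
    """P(X1 == X2) for two draws, assumed independent, after encoding colors as numbers."""
    total = Fraction(0)
    for c1, c2 in product(probs, repeat=2):
        if encoding[c1] == encoding[c2]:
            total += probs[c1] * probs[c2]
    return total

def mean(encoding):
    """E[X] under the given encoding; this does depend on the numbers chosen."""
    return sum(probs[c] * encoding[c] for c in probs)

# The two encodings proposed in the question.
enc_a = {"yellow": 1, "green": 3, "blue": 4}
enc_b = {"yellow": 2, "green": -4, "blue": 0}

p_a, p_b = prob_equal(enc_a), prob_equal(enc_b)
m_a, m_b = mean(enc_a), mean(enc_b)

print(p_a, p_b)  # same value for both encodings
print(m_a, m_b)  # different values
```

Any encoding that sends the three colors to three distinct numbers gives the same `prob_equal`, which is why that question is meaningful and the mean is not.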

The reason that random variables are defined as taking real values instead of values from some arbitrary set is that it makes some things simpler to express. That definition, for example, allows you to simply state that "The CDF of a random variable is right-continuous". If you had defined random variables as taking values from some arbitrary set, you wouldn't be able to do that.

So basically you pick a mapping from your actual outcomes to $\mathbb{R}$ which is as natural as possible, and then simply disregard any structure of $\mathbb{R}$ which doesn't apply to your actual outcomes. If your outcomes don't allow addition, then don't do any computations which add outcomes, as computing the mean does. If your outcomes don't have a natural ordering, then don't do anything which uses the ordering on $\mathbb{R}$, e.g. don't compute the median.

0
On

The numbers assigned to the colors should have some meaning. For example, if you have yellow, green, and blue balls, you could have random variables for their size, weight, or other attributes you're interested in.

So the expected weights could be 1 kg (yellow), 3 kg (green) and 4 kg (blue), i.e. E[Y] = 1, E[G] = 3, E[B] = 4, where Y, G and B denote the weights of a yellow, green and blue ball.

In your actual measurements, you will probably have variation in the weights of the balls, which justifies the use of random variables. I hope that makes things a bit clearer.

0
On

Any function $F:\ \Omega\to X$ defined on some set $\Omega$ and taking values in some other set $X$ becomes a random variable when a probability measure is defined on $\Omega$, or, to be precise, after we have set up a probability space $(\Omega,{\cal F},\mu)$. What's random about it is that fate chooses the point $\omega\in\Omega$ where $F$ will be evaluated. When $X={\mathbb R}$, fine, you will be able to add or multiply two random variables $F$, $\>G$, or to compute the expectation of $F$. When $X=\{{\rm red,\ green,\ blue}\}$ it is impossible to add function values $F(\omega)$, $\>G(\omega)$, but it makes sense to ask for the probability that we obtain "green" twice in a row (which requires the construction of a new probability space on $\Omega\times\Omega$).
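The "green twice in a row" question can be sketched in Python without ever turning colors into numbers. The single-draw probabilities below are hypothetical, and the two draws are assumed independent, so the measure on the product space is the product of the single-draw probabilities.

```python
from fractions import Fraction
from itertools import product

# Hypothetical distribution of F on X = {red, green, blue} (illustrative only).
p = {"red": Fraction(1, 4), "green": Fraction(1, 2), "blue": Fraction(1, 4)}

# Product space: each point is a pair (first draw, second draw).
# Independence is an assumption; it makes each pair's probability a product.
product_space = {(a, b): p[a] * p[b] for a, b in product(p, repeat=2)}

# The event "green twice in a row" is a subset of the product space.
p_two_greens = sum(q for (a, b), q in product_space.items() if a == b == "green")

print(p_two_greens)
```

Nothing in this computation adds or orders the colors; only the probabilities are numeric.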

As an example, consider a map $\Omega$ of the United States and define $$F(\omega):= \hbox{"state in which}\ \omega\ \hbox{lies"}\ ,$$ with the probability measure given by area divided by the total area of the US. When you throw a dart at this map a random point $\omega\in\Omega$ is chosen, and it makes sense to ask about the probability that $F(\omega)$ begins with an 'A'. But you cannot talk about the mean or the variation of $F$ without setting up additional structural elements, like distances between states.
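A toy version of the dart experiment can be simulated in Python. The "map" here is just the unit square cut into vertical strips, and the region names and boundaries are made up for illustration; they are not real state names or areas. The sketch compares a Monte Carlo estimate with the exact area ratio.

```python
import random

# Toy map: (left edge, right edge, region name). All values are hypothetical.
strips = [
    (0.0, 0.2, "Avalon"),
    (0.2, 0.5, "Borduria"),
    (0.5, 0.6, "Arcadia"),
    (0.6, 1.0, "Freedonia"),
]

def F(omega):
    """Random variable with non-numeric values: the region in which the point lies."""
    x, _y = omega
    for left, right, name in strips:
        if left <= x < right:
            return name
    raise ValueError("point lies outside the map")

# Exact probability: area of the 'A' regions divided by the total area (which is 1).
exact = sum(right - left for left, right, name in strips if name.startswith("A"))

# Monte Carlo estimate: throw darts uniformly at the unit square.
rng = random.Random(0)  # fixed seed so the run is reproducible
n = 100_000
hits = sum(F((rng.random(), rng.random())).startswith("A") for _ in range(n))
estimate = hits / n

print(exact, estimate)
```

The event "the name begins with an 'A'" is perfectly well defined, whereas something like `sum` over the values of `F` would have no meaning without extra structure on the set of regions.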