Can someone clarify for me the difference between:
- generating a random variable (according to a certain distribution) versus
- generating a random number (according to a certain distribution)?
I am thoroughly confused by this concept, because my goal is simply to generate a set of data points according to a certain distribution, and every single reference out there is about generating random variables according to a certain distribution.
Mathematically speaking, a random variable is a function. A random number is a scalar. They are totally different things.
For example, this pdf (http://opim.wharton.upenn.edu/~sok/papers/s/rv.pdf) is titled "Generating a random variable" and starts an example with "the most widely used method of generating pseudo-random numbers are the congruential generator". But the notation for generating the pseudo-random number follows the convention of random variables.
This reference talks about generating random variables with the rejection-sampling method: http://www.columbia.edu/~ks20/4703-Sigman/4703-07-Notes-ARM.pdf But I thought this method was used for generating random numbers (which is the end goal for everyone)?
What is the distinction between these two concepts?
A random variable $f$ is typically defined as a real-valued measurable function. There are various characterisations; a typical one is that the set $f^{-1}((-\infty,\alpha])$ is a member of some specified $\sigma$-algebra for every $\alpha \in \mathbb{R}$. There is not necessarily a single distribution associated with the function, since the distribution depends on the probability measure placed on the underlying space. The term random is a misnomer; there is nothing random in the colloquial sense.
The constant function $f(\omega) = 1$ is a perfectly well behaved random variable with nothing random about it whatsoever.
A random variable is not generated as such; it is defined.
A random number generator is understood loosely as some process that produces a sequence of numbers whose statistics approach some specified ideal.
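To make the distinction concrete, here is a minimal Python sketch. The name `exponential_rv`, the choice of the exponential distribution, and the sample space $[0,1)$ are my own illustrative assumptions, not something taken from the linked references. The function is the random variable: it is defined once and is completely deterministic. The random numbers are what you get by evaluating it at points $\omega$ supplied by a pseudo-random number generator (this is the inverse-transform idea, not the rejection method from the second link).

```python
import math
import random


def exponential_rv(omega, rate=1.0):
    """A random variable: a deterministic function on the sample space [0, 1).

    With the uniform measure on [0, 1) it has an exponential distribution
    with the given rate (inverse of the CDF 1 - exp(-rate * x)).
    """
    return -math.log(1.0 - omega) / rate


# The generator supplies the points omega; evaluating the fixed function
# at those points is what produces the "random numbers".
rng = random.Random(0)  # seeded only so the run is reproducible
samples = [exponential_rv(rng.random()) for _ in range(100_000)]

# The sample statistics should approach the ideal ones (mean 1 / rate).
sample_mean = sum(samples) / len(samples)
```

So when a reference says "generating a random variable", it is loose shorthand for generating numbers whose statistics match the distribution of that random variable.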
Aside:
Here is a nice collection of random variables (functions) stolen from Kac's wonderful monograph "Statistical Independence in Probability, Analysis and Number Theory" for your amusement.
Let $r(x) = (-1)^{\lfloor x \rfloor}$, and let $r_n(x) = r (2^n x)$ for $x \in [0,1]$ and $n \in \mathbb{N}$.
The $r_n$ are known as the Rademacher functions. If you plot a few you will see that they are square waves of different frequencies.
Clearly there is nothing random about the $r_n$; however, we can prove many results that have a probabilistic flavour, such as $\lim_{n \to \infty} \frac{r_1(x)+\cdots + r_n(x)}{n} = 0$ for almost all $x$, or the stronger (and more difficult to prove) $\limsup_{n \to \infty} \frac{|r_1(x)+\cdots + r_n(x)|}{\sqrt{n \log(\log n)}} = \sqrt{2}$ for almost all $x$ (cf. the law of the iterated logarithm).
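If you want to see the first limit numerically, here is a rough Python sketch. The helper name `rademacher` and the representation of $x$ by an integer `k` holding its leading binary digits are my own assumptions for illustration. Since $\lfloor 2^n x \rfloor$ depends only on the first $n$ binary digits of $x$, integer shifts give the exact value of $r_n(x)$ for $n \le$ `nbits`, which floating point would not for large $n$. Picking the digits with a pseudo-random generator is just a convenient way to land on a "typical" $x$, and a finite computation can only suggest the limit, not prove it.

```python
import random


def rademacher(n, k, nbits):
    """r_n(x) for any x in [0, 1) whose first nbits binary digits are those of k.

    Valid for 1 <= n <= nbits: floor(2**n * x) equals k >> (nbits - n),
    so r_n(x) = (-1) ** floor(2**n * x) is computed exactly with integers.
    """
    return -1 if (k >> (nbits - n)) & 1 else 1


nbits = 10_000
k = random.Random(0).getrandbits(nbits)  # leading binary digits of a "typical" x

# (r_1(x) + ... + r_n(x)) / n for n = nbits; expected to be close to 0.
partial_sum = sum(rademacher(n, k, nbits) for n in range(1, nbits + 1))
average = partial_sum / nbits
```

Each $r_n$ here is still a fixed square wave; the only thing that varies is the point $x$ at which it is evaluated.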