Average of random numbers


If there were an unbiased machine that outputs perfectly random real numbers between 0 and 1, and we took a large number of these outputs and averaged them, would that average tend to $\frac{1}{2}$? Or would it again tend to some random number?



It depends on what "random" means and how you select your "large number" of outputs. If "random" means uniformly distributed and independent, and you select the outputs uniformly and independently, then yes, the mean will tend to $\frac{1}{2}$. This follows from the Law of Large Numbers, which says that if you repeatedly draw independent samples from a probability distribution, the sample average converges to the expected value of the distribution as the number of draws tends to $\infty$.
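The Law of Large Numbers is easy to see empirically. A minimal sketch (the function name `sample_mean` is mine, not from the answer), using only the standard library:

```python
import random

def sample_mean(n, seed=0):
    """Average of n independent Uniform(0, 1) draws."""
    rng = random.Random(seed)
    return sum(rng.random() for _ in range(n)) / n

# As n grows, the average settles toward the expected value 1/2.
for n in (10, 1_000, 100_000):
    print(n, round(sample_mean(n), 4))
```

With $n = 100{,}000$ the result should sit within a fraction of a percent of $\frac{1}{2}$.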

But "random" by itself doesn't mean that the outputs are independent or uniformly distributed. For example, consider the probability density $f(x) = 2x, \ x\in [0,1]$. Here, you're much more likely to randomly select values of $x>\frac{1}{2}$ than values of $x< \frac{1}{2}$, and the expectation is in fact $\mu = \frac{2}{3}$. As for independence, consider, for example, an autoregressive pattern, where the latest output is randomly selected from a probability distribution, but that probability distribution is correlated with or conditional on earlier outputs.
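To illustrate the non-uniform case, one can sample from the density $f(x) = 2x$ by inverse-transform sampling: the CDF is $F(x) = x^2$, so $X = \sqrt{U}$ with $U$ uniform on $(0,1)$ has exactly this density. A sketch (the function name is hypothetical):

```python
import math
import random

def sample_mean_triangular(n, seed=0):
    """Average of n draws from the density f(x) = 2x on [0, 1].

    Inverse-transform sampling: the CDF is F(x) = x^2, so
    X = sqrt(U) with U ~ Uniform(0, 1) has density 2x.
    """
    rng = random.Random(seed)
    return sum(math.sqrt(rng.random()) for _ in range(n)) / n

# The average tends to E[X] = ∫₀¹ x · 2x dx = 2/3, not 1/2.
print(sample_mean_triangular(100_000))
```

So a machine that is "random" but not uniform would drive the average to a different limit.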

Similarly, although the machine is assumed to generate values without bias, if there is bias in how you select your "large number" of outputs, that selection bias can also shift the observed sample mean.
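Selection bias is also easy to demonstrate. In this sketch the filtering rule (keeping only outputs above $0.3$) is a hypothetical choice of mine, not from the answer:

```python
import random

rng = random.Random(0)
samples = [rng.random() for _ in range(100_000)]

# Unbiased machine, but a biased selection rule: keep only
# outputs above 0.3 (a hypothetical filtering criterion).
kept = [x for x in samples if x > 0.3]

print(sum(samples) / len(samples))  # ≈ 0.5
print(sum(kept) / len(kept))        # ≈ 0.65, since E[X | X > 0.3] = (0.3 + 1)/2
```

The machine itself is unbiased, yet the mean of the retained outputs converges to $0.65$, not $\frac{1}{2}$.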


would that average tend to $\frac12$? Or would it again tend to some random number?

The average will indeed be a random number (a "random variable"); and that random variable will tend (in some precise sense) to $\frac12$.

In which sense? In several senses, actually. To begin with, by the (weak) law of large numbers, you can assert that as $n$ (the number of samples) grows, the probability that the average is not near $\frac12$ (say, the probability that the average is below $0.4$ or above $0.6$) tends to zero.

You can say more. By the central limit theorem (CLT), for large $n$ the distribution of the average (which, remember, is itself a random variable) will tend to a Gaussian distribution with mean $\frac{1}{2}$ and standard deviation $\frac{1}{\sqrt{12n}}$... which tends to zero as $n$ grows.
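The CLT prediction can be checked numerically: compute the sample mean many times and compare its empirical spread to $\frac{1}{\sqrt{12n}}$ (the standard deviation of a $\mathrm{Uniform}(0,1)$ variable is $\frac{1}{\sqrt{12}}$). A sketch, with the repetition counts chosen arbitrarily:

```python
import math
import random
import statistics

def mean_of_n(n, rng):
    """One realization of the average of n Uniform(0, 1) draws."""
    return sum(rng.random() for _ in range(n)) / n

rng = random.Random(0)
n = 1_000

# Empirical spread of the sample mean across many repetitions,
# versus the CLT prediction 1/sqrt(12 n).
means = [mean_of_n(n, rng) for _ in range(2_000)]
print(statistics.stdev(means))  # empirical standard deviation
print(1 / math.sqrt(12 * n))    # theoretical value ≈ 0.00913
```

The two printed numbers should agree to within a couple of percent, and both shrink like $\frac{1}{\sqrt{n}}$ as $n$ grows.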

All this, granted that the numbers produced by your unbiased machine are uniformly distributed in $[0,1]$ and independent.