Generate a set of random numbers with an average evenly distributed between two given values


1) I generate 1000 random numbers between 0 and 10 and take the average.

If I do the above action "many" times the resulting average values will be a normal distribution over 0 to 10. Correct?

What I want after "many" iterations of generating 1000 random numbers (+ some manipulation) is to produce average values between 3 and 7, distributed evenly between 3 and 7.

What's my approach here?


There are 3 best solutions below

2
On

You could generate $1000$ random numbers, average them to find their mean $\mu_0$, generate one more random number $\mu$ uniformly distributed between $3$ and $7$ to serve as the mean you want, and add $\mu-\mu_0$ to each of the original $1000$. This will likely shift some of the numbers out of the original interval. Does this meet your needs?
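The shift described above can be sketched in a few lines of Python. This is only an illustration using the standard library; the function name `shifted_sample` and its parameters are made up for this example.

```python
import random
import statistics

def shifted_sample(n=1000, low=0.0, high=10.0,
                   target_low=3.0, target_high=7.0, rng=random):
    """Draw n uniforms on [low, high], then shift them all so that
    their mean equals a target drawn uniformly from [target_low, target_high]."""
    xs = [rng.uniform(low, high) for _ in range(n)]
    mu0 = statistics.fmean(xs)                  # mean of the raw sample
    mu = rng.uniform(target_low, target_high)   # the mean we want
    shift = mu - mu0                            # may be negative
    # Shifted values can fall outside [low, high], as noted in the answer.
    return [x + shift for x in xs], mu

rng = random.Random(0)
sample, target = shifted_sample(rng=rng)
```

After the call, `statistics.fmean(sample)` equals `target` up to floating-point rounding.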

2
On

Let the average be denoted by $X$. By the central limit theorem we know that, approximately:

$$X \sim N(\mu,\sigma^2)$$

where

$\mu$ and $\sigma^2$ depend on the range over which you sample.

The problem then is how to transform $X$ to be uniform between 3 and 7. We can use the probability integral transform to accomplish this. Specifically, if we let:

$$Y = F_X(X)$$

where $F_X(.)$ is the cdf of $X$.

Then it follows that:

$$Y \sim U(0,1)$$

We can then rescale $Y$ to get the desired values to lie between 3 and 7 as follows:

$$Z = 3 + 4 Y$$

By construction, $Z$ is uniform between 3 and 7, as required.

Edit

The algorithm to follow would be the following:

Step 1: Generate 1000 random numbers uniformly between 0 and 10.

Step 2: Compute the average.

Step 3: Repeat steps 1 and 2 1000 times. You now have 1000 sample averages, which should approximately follow a normal distribution because of the central limit theorem.

Note: From statistical theory we know that the average values approximately follow a normal distribution with mean 5 and variance $\frac{100}{12000}$.

So, you could replace steps 1-3 above by drawing 1000 random variables from $N(5,\frac{100}{12000})$.

Step 4: For each one of the averages from step 3 (denote by $X$), compute $Y = F(X,5,\frac{100}{12000})$.

Here $F(X,5,\frac{100}{12000})$ is the cdf of the normal distribution centered at 5 and with variance $\frac{100}{12000}$, evaluated at $X$.

So, now we have 1000 values of $Y$ corresponding to the 1000 sample averages (i.e., $X$) from step 3. But, $Y$ is uniformly distributed between 0 and 1. Hence you need to transform each one of the values of $Y$ as outlined in step 5.

Step 5: Compute $Z=3+4Y$

Thus, you now have 1000 values (i.e., $Z$) which are distributed uniformly between 3 and 7 as desired.
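As a rough sketch, steps 1 to 5 could look like this in Python, using only the standard library (`statistics.NormalDist` provides the normal cdf). The names `N`, `MU`, `SIGMA` and `transformed_average` are chosen for this illustration only.

```python
import random
from statistics import NormalDist, fmean

N = 1000                           # numbers per sample
MU = 5.0                           # mean of a uniform on [0, 10]
SIGMA = (100 / (12 * N)) ** 0.5    # std dev of the sample mean
NORMAL = NormalDist(MU, SIGMA)     # approximate law of the average

def transformed_average(rng):
    xs = [rng.uniform(0, 10) for _ in range(N)]   # step 1
    xbar = fmean(xs)                               # step 2
    y = NORMAL.cdf(xbar)                           # step 4: Y is ~ U(0, 1)
    return 3 + 4 * y                               # step 5: Z is ~ U(3, 7)

rng = random.Random(42)
zs = [transformed_average(rng) for _ in range(1000)]   # step 3
```

A histogram of `zs` should look roughly flat between 3 and 7. Note that each value is a transformed average, not the average of any actual set of 1000 numbers.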

0
On

The sample mean of $n=1000$ random numbers uniformly distributed between $0$ and $10$ is approximately gaussian with mean $\mu=5$ and standard deviation $\sigma=10/\sqrt{12n}\approx1/11$, hence you are asking for a method to transform some gaussian random variables into uniform ones.
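For reference, the value of $\sigma$ follows from the variance of a uniform distribution on an interval of length $10$ and the independence of the $n$ draws $\xi_k$:

$$\operatorname{Var}(\xi_k)=\frac{(10-0)^2}{12}=\frac{100}{12},\qquad \sigma^2=\frac{\operatorname{Var}(\xi_k)}{n}=\frac{100}{12n},\qquad \sigma=\frac{10}{\sqrt{12n}}.$$

For $n=1000$ this gives $\sigma=10/\sqrt{12000}\approx0.0913\approx1/11$.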

Inverting the gaussian CDF is usually considered rather unwieldy, due to the lack of a manageable closed-form expression. Instead one can use the Marsaglia polar method to get uniform random variables from gaussian ones, that is, in the direction opposite to the one in which it is usually used.

Thus, one starts with two independent gaussian random variables $X$ and $Y$ with the mean $\mu$ and variance $\sigma^2$ computed above, replaces them by the standardized versions $X_0=(X-\mu)/\sigma$ and $Y_0=(Y-\mu)/\sigma$, and computes $U_0=\exp(-\frac12(X_0^2+Y_0^2))$. Then $U_0$ is uniformly distributed on $(0,1)$, hence $U=3+4U_0$ is uniformly distributed on $(3,7)$.

In summary, starting from $2n$ independent values $\xi_k$ uniformly distributed between $0$ and $10$, one considers $$ U=3+4\exp\left(-\frac3{50n}V\right),$$ where $$ V=\left(\sum\limits_{k=1}^n\xi_k-5n\right)^2+\left(\sum\limits_{k=n+1}^{2n}\xi_k-5n\right)^2. $$ Of course, this is rather strange since $U$ is roughly distributed as every $3+\frac25\xi_k$.
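The formula for $U$ above translates fairly directly into code. Here is a minimal Python sketch for the continuous case; the function name `polar_uniform` is hypothetical and only the standard library is used.

```python
import math
import random

def polar_uniform(xis, n):
    """Given 2n values xis uniform on [0, 10], return
    U = 3 + 4 * exp(-3 V / (50 n)), where V is the sum of the squared
    centered sums of the first n and the last n values."""
    s1 = sum(xis[:n]) - 5 * n        # centered sum of the first half
    s2 = sum(xis[n:2 * n]) - 5 * n   # centered sum of the second half
    v = s1 * s1 + s2 * s2
    return 3 + 4 * math.exp(-3 * v / (50 * n))

n = 1000
rng = random.Random(7)
us = [polar_uniform([rng.uniform(0, 10) for _ in range(2 * n)], n)
      for _ in range(500)]
```

Each call consumes $2n$ draws and returns a single value, so this is an expensive way to obtain something that is approximately uniform on $(3,7)$.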

Or perhaps what you mean is that the random variables $\xi_k$ you are starting from are in fact integer valued and uniformly distributed on $\{0,1,\ldots,10\}$. The sample mean of $n$ such random integers is approximately gaussian with mean $\mu^D=5$ and standard deviation $\sigma^D=\sqrt{10/n}=1/10$. The rest of the simulation procedure applies, hence one should consider $$ U^D=3+4\exp\left(-\frac1{20n}V^D\right),$$ where $$ V^D=\left(\sum\limits_{k=1}^n\xi^D_k-5n\right)^2+\left(\sum\limits_{k=n+1}^{2n}\xi^D_k-5n\right)^2. $$

Edit

The values of $U$ and $U^D$ are always in the interval $(3,7]$, for every sample $(\xi_k)_{1\leqslant k\leqslant 2n}$.