How to scale a random integer in $[A,B]$ and produce a random integer in $[C,D]$

1.3k Views Asked by At

I know there are many methods to scale a number from range $[A,B]$ to a range $[C,D]$, and I've searched over and over the web. I've seen this math.SE thread.

I need to scale a big number (signed 32-bit integer) between $-2147483648$ and $2147483647$, to a smaller range (from 0 to millions).

However, the original number is generated by a random number generator that I cannot control in any way, so I must to be sure that the output formula does not alter its randomness (in a way that I can demonstrate - academic papers are well accepted).

I need the correct way to scale this number taking in consideration that:

  • the output range must have as maximum value a power of two (e.g., $4194304$)
  • the formula can be demonstrated
  • the formula does not alter the randomness of the source number

Anyone may be of help?

3

There are 3 best solutions below

9
On

It sounds like the input number is a sequence of 32 bits (interpreted as a signed integer). If the desired output is also a random sequence of $n$ bits, for some $n < 32$, then an easy way is to truncate the input sequence to the first $n$ bits. It's pretty clear that if the input is uniformly distributed, then the output will also be uniformly distributed.

But if the output is in a range of $N$ possibilities, where $N$ is not a power of 2, then it cannot be uniformly distributed among those possibilities. So you won't get your output to be completely random in that case. We can show this with a simple counting argument.

ADDED: A counting argument also proves the uniform distribution mentioned above. For example, suppose the input is uniformly distributed from $-2^{31}$ to $2^{31}-1$ and the desired output is from $0$ to $2^{22}-1$. We can truncate by throwing away the most significant bits with the function $y = (x + 2^{31}) \mod 2^{22}$. The function maps exactly ${2^{32}}/{2^{22}} = 2^{10}$ input values to each output value. We assumed that the probability of getting any particular input value is $1 / 2^{32}$, so the probability of getting any particular output value is $2^{10} / 2^{32} = 1 / 2^{22}$, which means that any output value is equally likely.

5
On

This works in special cases:

Let $[a,b]$ be an interval and consider a uniform distribution on the intervals in this interval. Let $k$ be the number of integers in $[a,b]$.

Let $[c,d]$ be another interval and suppose that the number of integers in this interval is $l$.

BIG ASSUMPTION: Suppose that $l\mid k$.

Then, any $\frac{k}{l}=m$ to $1$ map will give you a uniform distribution on $[c,d]$.

For example, if you have the number $n$, you could take $n\pmod l$ and then add $c$.

If you don't have the BIG ASSUMPTION, then you may be able to weight the map described here, (and not have it exactly $m$ to $1$, but that is more challenging and situation specific).

0
On

As other answers have noted, this is impossible; in particular, you start with a discrete distribution, where each outcome has a probability of $2^{-32}$. Now, if you take some function $f$ taking in such random outcomes and outputting an integer in the desired range, then it's clear that the probability of receiving some $n$ as the output equals $2^{-32}$ times the number of outcomes $x$ such that $f(x)=n$ - in particular, the probability is always a multiple of $2^{-32}$. Unfortunately, numbers like $\frac{1}3$ are not multiples of $2^{-32}$, so we can't get a uniform distribution on three numbers (be we can get a uniform distribution on $2^n$ outcomes for $n\leq 32$ just by truncating the number).

However, here's a standard and simple procedure to generate a random integer in $[1,N]$ given the capacity to generate a random one in $[1,M]$ for $M\geq N$. Your problem is easily under this class.

Suppose that $M=aN + b$ where $a$ is an integer and $b$ is positive and less than $N$ - that is, $b$ is the remainder of $M$ divided by $N$ and $a$ is the integer part of the quotient. Generate an integer $x$ in $[1,M]$. If it is less than or equal to $aN$, return the remainder of $x_0$ divided by $N$. Otherwise, generate another integer in $[1,M]$ and test if it is lower than $aN$. Repeat indefinitely.

That is, to generate a random number, you throw out certain results and try again. For instance, if we wanted to generate a random integer in $[1,2]$ given the ability to do so in $[1,3]$, we simply keep finding random numbers in $[1,3]$, until one of them is either $1$ or $2$. This is clearly a uniform distribution. In the very worst case, the above method throws out about $\frac{1}2$ of the results generated. However, if you are generating numbers in $[1,2^{32}]$ and seeking a random number between $1$ and a million, you will accept almost every (e.g. about 99.97%) random number you get, so there's no great loss in efficiency.