I know there are many methods to scale a number from range $[A,B]$ to a range $[C,D]$, and I've searched over and over the web. I've seen this math.SE thread.
I need to scale a big number (signed 32-bit integer) between $-2147483648$ and $2147483647$, to a smaller range (from 0 to millions).
However, the original number is generated by a random number generator that I cannot control in any way, so I must to be sure that the output formula does not alter its randomness (in a way that I can demonstrate - academic papers are well accepted).
I need the correct way to scale this number taking in consideration that:
- the output range must have as maximum value a power of two (e.g., $4194304$)
- the formula can be demonstrated
- the formula does not alter the randomness of the source number
Anyone may be of help?
It sounds like the input number is a sequence of 32 bits (interpreted as a signed integer). If the desired output is also a random sequence of $n$ bits, for some $n < 32$, then an easy way is to truncate the input sequence to the first $n$ bits. It's pretty clear that if the input is uniformly distributed, then the output will also be uniformly distributed.
But if the output is in a range of $N$ possibilities, where $N$ is not a power of 2, then it cannot be uniformly distributed among those possibilities. So you won't get your output to be completely random in that case. We can show this with a simple counting argument.
ADDED: A counting argument also proves the uniform distribution mentioned above. For example, suppose the input is uniformly distributed from $-2^{31}$ to $2^{31}-1$ and the desired output is from $0$ to $2^{22}-1$. We can truncate by throwing away the most significant bits with the function $y = (x + 2^{31}) \mod 2^{22}$. The function maps exactly ${2^{32}}/{2^{22}} = 2^{10}$ input values to each output value. We assumed that the probability of getting any particular input value is $1 / 2^{32}$, so the probability of getting any particular output value is $2^{10} / 2^{32} = 1 / 2^{22}$, which means that any output value is equally likely.