Random number generation and collision check

593 Views Asked by At

I was working on a random number generator in NodeJS, I use Math.random() which gives a random number between 0 and 1. I can get a whole number between 0 and X numbers using the following function:

Math.floor(Math.random() * ( X - 0 + 1) + 1)

Now I noticed something. If I want lets say 99 unique numbers I need to keep my maximum range of numbers (X) to at least 9999, i.e double the number of digits. This works where required unique numbers are 999 (3 digits) and when X is at least 999999 (6 digits), and so on.

I don't know why, but this gives me least amount of collisions. Is there a mathematical reason behind this or something? Like is there a rule which says something similar?

Regards

1

There are 1 best solutions below

0
On BEST ANSWER

Further to the comments, let's try to find some bounds for the probability of having at least one collision $$P(k,n)=1-\frac{n!}{(n-k)!\cdot n^k} \tag{1}$$ where $k$ is the sample size and $n$ is the "space" size, $k<n$. For more details see Birthday Paradox wiki article and this question.


Proposition 1. We have $$1-e^{-\frac{(k-1)k}{2n}}\leq P(k,n)\leq 1-e^{k-1-n\log{\frac{n-1}{n-k}}} \tag{2}$$

First of all $$\frac{n!}{(n-k)!\cdot n^k}=\left(1-\frac{1}{n}\right)\left(1-\frac{2}{n}\right)\left(1-\frac{3}{n}\right)...\left(1-\frac{k-1}{n}\right) \tag{2a}$$ Now, using $$e^{-\frac{x}{1-x}}\leq 1-x \leq e^{-x}, x\in[0,1)$$ and indeed each $\frac{i}{n}\in[0,1)$, for $i=1..k-1$, we have from $(2a)$ $$e^{-\frac{1}{n-1}-\frac{2}{n-2}-...-\frac{k-1}{n-(k-1)}}\leq \frac{n!}{(n-k)!\cdot n^k}\leq e^{-\frac{1}{n}-\frac{2}{n}-...-\frac{k-1}{n}}\iff\\ e^{\sum\limits_{i=1}^{k-1}\frac{n-j-n}{n-j}}\leq \frac{n!}{(n-k)!\cdot n^k}\leq e^{-\frac{1}{n}\left(\sum\limits_{i=1}^{k-1}i\right)}\iff$$ $$e^{k-1-n\left(\sum\limits_{i=1}^{k-1}\frac{1}{n-j}\right)}\leq \frac{n!}{(n-k)!\cdot n^k}\leq e^{-\frac{(k-1)k}{2n}} \tag{2b}$$ Using integral test as a technique, it's easy to show $$\sum\limits_{i=1}^{k-1}\frac{1}{n-j}\leq\int\limits_{n-k}^{n-1}\frac{1}{x}dx=\log{\frac{n-1}{n-k}}$$ Multiply by $-n$, add $k-1$, apply this to $(2b)$ and we have $$e^{k-1-n\log{\frac{n-1}{n-k}}}\leq \frac{n!}{(n-k)!\cdot n^k}\leq e^{-\frac{(k-1)k}{2n}} $$ Multiply by $-1$, add $1$ and we have $(2)$.


A few observations. We have $$k=\underbrace{99..9}_{t \text{ times}}=10^t -1$$ and $$n=\underbrace{99..9}_{2t \text{ times}}=10^{2t} -1=\left(10^t -1\right)\left(10^t +1\right)=k(k+2) \tag{3}$$


Now let's plug $(3)$ into $(2)$ ant take the limit when $k\to\infty$. $$1-e^{-\frac{(k-1)k}{2n}}= 1-e^{-\frac{(k-1)k}{2k(k+2)}}= 1-e^{-\frac{k-1}{2(k+2)}}\to 1-\frac{1}{\sqrt{e}}, k\to\infty \tag{4a}$$ For the other side (to make it simple), I will use Taylor series and the fact that $e^x$ is continuous: $$k-1-n\log{\frac{n-1}{n-k}}= k-1-k(k+2)\log{\frac{k^2+2k-1}{k^2+k}}=\\ k-1-k(k+2)\log{\left(1+\frac{k-1}{k^2+k}\right)}=\\ k-1-k(k+2)\left(\frac{k-1}{k^2+k}-\frac{1}{2}\left(\frac{k-1}{k^2+k}\right)^2+O\left(\frac{1}{k^3}\right)\right)=\\ k-1\left(1-\frac{k(k+2)}{k^2+k}\right)+\frac{1}{2}\frac{k(k+2)(k-1)^2}{(k^2+k)^2}-O\left(\frac{1}{k}\right)=\\ -\frac{k(k-1)}{k^2+k}+\frac{1}{2}\frac{k(k+2)(k-1)^2}{(k^2+k)^2}-O\left(\frac{1}{k}\right) \to -1 + \frac{1}{2} = \\ -\frac{1}{2}, k\to\infty$$ As a result $$1-e^{k-1-n\log{\frac{n-1}{n-k}}} \to 1-\frac{1}{\sqrt{e}}, k\to\infty \tag{4b}$$


From $(2)$, $(4a)$, $(4b)$ and squeezing

Given $(3)$ or $n=k(k+2)$ we have: $$P(k,n) \to 1-\frac{1}{\sqrt{e}}, k\to\infty \tag{5}$$

Which is less than $0.4$, so there is "less than half" chance to obtain a collision.