Calculating Birthday Problem probabibilites for different number

62 Views Asked by At

A fellow programmer I know is designing his system around a random ID per "active" user, however this ID only has 1 million different possibilities. Thinking of the Birthday Problem I thought to myself that this isn't very safe even with a small userbase at first.

I was wondering how I could calculate for a set of different possibilities (1 million in my case) how many users you could use before you get a percentage chance of collision? I have tried solving this programatically and I came up with about an average of 1250 users where the most collisions started appearing. I found some formulaes but they all seem to use giant factorials that my programs / calculators can't perform.

Is there any way to solve this with a single equation (instead of recursion), something like:

With x different unique ID's, you'd expect a y% chance of a collision at n users. (x and y being input)

Sorry if I'm using the wrong definitions, just a programmer with enthusiasm for math :)

1

There are 1 best solutions below

4
On BEST ANSWER

The Wikipedia article says the number needed for $50\%$ chance of collision where there are $n$ possible days is about $\sqrt {2 \ln 2 n}$. That has no large factorials in it. It is about $1177$ To deal with large factorials, the standard approach is Stirling's approximation $n! \approx (\frac ne)^n\sqrt {2 \pi n}$ It is quite accurate and if you take logs you avoid the big numbers.