Generating a deterministic random function that takes into account the date as well as a long ID

SQL Server has a function, RAND(), that returns a pseudorandom number determined by the seed you pass to it. No matter when you pass the same seed, it always generates the same number.

I'm a database programmer and am having a tough time deciding on an algorithm that combines a date-based seed with a large number (approximately 10^9) into a single value to pass to RAND(), so that I get a pseudorandom number based on the date and the ID.

Currently I have something like this:

FLOOR(RAND(@seed * LOG([p].[patientid])) * 1000)

where the patientid is the large number (10^9) and the @seed is some unique integer that is a function of the date today.

The result is a different value for every patientid, and you can go back in time because the @seed is just a function of Datetime.

Can you share an algorithm for a deterministic random function that takes both the date and a long ID into account and returns a random-looking value between 1 and 10,000? Given the same seed and ID, it should always return the same value.

Best answer

It seems like you are looking for a hash function. You are already using an expression of the form FLOOR(n*RAND(x)) as a hash function. If you want to be extra careful about collisions (to avoid an increased chance of collision if LOG(patientid) is somehow related to the seed), you could try

FLOOR(1000*RAND(seed + FLOOR(10000*RAND(patientid))))

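Because this hashes the patientid before combining it with the seed, it should remove any relationship between the two and so reduce the chance of collisions. Note that the outer expression as written yields values from 0 to 999; to land in the 1 to 10,000 range the question asks for, use 1 + FLOOR(10000*RAND(...)) as the outer expression instead.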
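To make the two-step idea concrete, here is a minimal sketch in Python rather than T-SQL. It is only an analogue: Python's random.Random is a different generator from SQL Server's RAND(), so the numbers will not match, and the function name nested_rand_bucket and its parameters are made up for illustration. The outer step is scaled to return 1 to 10,000, as the question asks.

```python
import random


def nested_rand_bucket(seed: int, patient_id: int, buckets: int = 10_000) -> int:
    """Hypothetical analogue of FLOOR(n*RAND(seed + FLOOR(n*RAND(patientid))))."""
    # Inner step: scramble the ID on its own into 0..buckets-1,
    # so the ID and the date seed are not combined directly.
    inner = int(buckets * random.Random(patient_id).random())
    # Outer step: add the date-based seed, draw again, and shift to 1..buckets.
    return 1 + int(buckets * random.Random(seed + inner).random())


if __name__ == "__main__":
    # The same (seed, ID) pair always gives the same value.
    print(nested_rand_bucket(20240115, 1_000_000_007))
    print(nested_rand_bucket(20240115, 1_000_000_007))
```

Because the inner step can only take 10,000 distinct values, different IDs will still share results; this sketch shows the structure of the expression, not a collision-free scheme.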

A more standard approach is to use a cryptographic hash function such as SHA-256, which SQL Server (2012 and later) exposes through HASHBYTES('SHA2_256', ...). Conceptually you compute SHA256(seed, patientid), which gives you 256 bits of pseudorandom output. The hash takes a single sequence of bytes as input, so SHA256(seed, patientid) really means something like SHA256(seed + ',' + patientid), where the seed and patientid are converted to strings and concatenated with a separator between them. From the output of the hash you can then extract a number between 1 and 10,000 (for example, by reading some of the bytes as an integer, taking it modulo 10,000, and adding 1), or whatever range you want.
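The hash-based approach can be sketched as follows. This is an illustration in Python, not T-SQL, and it rests on a few assumptions that are not in the original: the function name daily_bucket, the use of the ISO date string as the seed, the comma separator, and the choice to read the first 8 bytes of the digest. A SQL Server version would build the same concatenated string and hash it there.

```python
import hashlib
from datetime import date


def daily_bucket(day: date, patient_id: int, buckets: int = 10_000) -> int:
    """Map (day, patient_id) to a stable pseudorandom integer in 1..buckets."""
    # Join the inputs with a separator so that, for example,
    # ("1", "23") and ("12", "3") do not produce the same byte string.
    message = f"{day.isoformat()},{patient_id}".encode("utf-8")
    digest = hashlib.sha256(message).digest()  # 32 bytes of output
    # Read the first 8 bytes as an unsigned integer, then reduce to the range.
    value = int.from_bytes(digest[:8], "big")
    return 1 + value % buckets


if __name__ == "__main__":
    today = date(2024, 1, 15)
    # Same date and ID always produce the same number, on any run.
    print(daily_bucket(today, 1_000_000_007))
    print(daily_bucket(today, 1_000_000_007))
```

Taking the value modulo 10,000 introduces a tiny bias because 2^64 is not a multiple of 10,000, but with 64 input bits that bias is negligible for this purpose.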