How do I design a hash function for this given case?

119 Views Asked by At

My professor gave me the following problem.

You are to store objects identified by integers from the interval $[0..10^9 − 1]$. You expect never to have to store more than 1400 objects. Design the hash function h of the type discussed in class (universal hashing). That is, select a prime number on which the hash function will be based, the ”dimensionality” of your hash funciton, say k, and k parameters $\{ a_0, . . . , a_{k−1}\}$. Given your selections, what is $h(350620501)$? Explain all choices you made.

I've developed the following answer.

Generate a prime number p that is greater than the range of keys that the objects in the table will be identified by $[0…(10^9-1)]$. The first prime greater than $10^9$ will do for this example, so let $p = 1000000007$. Because the data type to be hashed are integers, two randomly generated integers modulo p are chosen to add randomness to the hash function, so let $a = 63873076$ and $b = 87191394$. The number of slots in the table will be the maximum number of objects we will be expecting to store, so let $m = 1400$. The universal hash function will be defined as:

$h(x) = ((ax + b) \mod p) \mod m$

$h(350 620 501) = 7$

This is my first time generating a hash function, so I wanted to ensure that my solution and understanding was correct--specifically, my thinking behind choosing the value for m and the reason that values a and b are added to the function.