Consistent hashing using modulo

144 Views Asked by At

Following my answer here:

Suppose I have n servers, and I want to distribute files evenly between them (same number of files on each server).

Initially n=2 and I use the following function to map a file f to a server n: n = f%2.

Now I have a new server (n=3), and I want to move existing files to the new server, such that all files will be distributed evenly again. For this I use the following scheme:

f%6  Old n New n
0    0     0
1    1     1
2    0     0
3    1     1
4    0 --> 2
5    1 --> 2

Of course I want to minimize the number of files I move.

To locate a server after the change:

f%6 in [0,2] --> n=0
f%6 in [1,3] --> n=1
f%6 in [4,5] --> n=2

Now, there is a fourth server available (N=4). We need to move 1/3 of the files to the new server (In general, when the nth server is added, we need to move 1/(n-1) of the files).

I use the following file redistribution scheme (other schemes are possible as well):

 f%12 Old n New n
 0    0     0
 1    1     1
 2    0     0
 3    1     1
 4    2     2
 5    2     2
 6    0     0
 7    1     1
 8    0 --> 3
 9    1 --> 3
10    2     2
11    2 --> 3

To locate a server after the change:

f%12 in [0,2, 6] --> n=0
f%12 in [1,3, 7] --> n=1
f%12 in [4,5,10] --> n=2
f%12 in [8,9,11] --> n=3

In general, I am constructing a consistent hashing from 1..lcm(1,2,...,N) to 1..N.

When N grows, lcm(1,2,...,N) grows fast, and I don't want to store a huge mapping table.

No to my question:

Is there a way to analytically calculate server(f) for a given value of N using the suggested method?

(f is the file Id, N is the number of servers)