Following my answer here:
Suppose I have n servers, and I want to distribute files evenly between them (same number of files on each server).
Initially n=2 and I use the following function to map a file f to a server n: n = f%2.
Now I have a new server (n=3), and I want to move existing files to the new server, such that all files will be distributed evenly again. For this I use the following scheme:
f%6 Old n New n
0 0 0
1 1 1
2 0 0
3 1 1
4 0 --> 2
5 1 --> 2
Of course I want to minimize the number of files I move.
To locate a server after the change:
f%6 in [0,2] --> n=0
f%6 in [1,3] --> n=1
f%6 in [4,5] --> n=2
Now, there is a fourth server available (N=4). We need to move 1/3 of the files to the new server (In general, when the nth server is added, we need to move 1/(n-1) of the files).
I use the following file redistribution scheme (other schemes are possible as well):
f%12 Old n New n
0 0 0
1 1 1
2 0 0
3 1 1
4 2 2
5 2 2
6 0 0
7 1 1
8 0 --> 3
9 1 --> 3
10 2 2
11 2 --> 3
To locate a server after the change:
f%12 in [0,2, 6] --> n=0
f%12 in [1,3, 7] --> n=1
f%12 in [4,5,10] --> n=2
f%12 in [8,9,11] --> n=3
In general, I am constructing a consistent hashing from 1..lcm(1,2,...,N) to 1..N.
When N grows, lcm(1,2,...,N) grows fast, and I don't want to store a huge mapping table.
No to my question:
Is there a way to analytically calculate server(f) for a given value of N using the suggested method?
(f is the file Id, N is the number of servers)