In order to find the appropriate length of chromosomes in GA programming, the author of this book states:
Suppose six decimal places for the variables' values is desirable. It is clear that to achieve such precision each domain
Di = [ai,bi]should be cut into(bi - ai) * 10^6equal size ranges. Let us denote bymithe smallest integer such that(bi - ai) * 10^6 <= 2^mi - 1. Then, a representation having each variablexicoded as a binary string of lengthmiclearly satisfies the precision requirement. Additionally, the following formula interprets each such string:
xi = ai + decimal(1001...001) * (bi - ai)/(2^mi - 1)where
decimal(string)represents the decimal value of that binary string.
So here is my question: Why is the author using (bi - ai)/(2^mi - 1)? Why not (bi - ai)/(2^mi)? What is that -1 for?
I searched it and I thought it might have something to do with the Mersenne Prime numbers because of the formulation!! I also checked out the Schema as I thought it might be related to that, but these all seem completely unrelated!
The author uses (2^mi - 1) because it is necessary for the encoding to have values representing BOTH ai and bi (note: the interval is closed, i.e contains both endpoints). The author is using mi bits to represent 2^mi values, so the interval [ai, bi] is being divided to produce 2^mi representable points - and this necessarily divides up the interval into (2^mi - 1) sub-intervals. That's a general property, by the way - if you mark N points inside a closed interval, you produce N+1 sub-intervals [to see this, note that each point -except the last- starts one of the sub-intervals]. Here N is (2^mi - 1).