I am toying with a genetic algorithm for the first time to evolve a very simple neural network. For rendering purposes, I would like to assign my agents a color based on their 'genome', such that similar genomes get similar colors. A genome, in my case, is an arbitrary-length array of 16-bit 'genes', although all agents in a given run have the same genome length.
As such, I want to create a function which converts a $16n$-bit string into a 3-tuple with each element in the range $[0,255]$, i.e. an RGB (or HSL, etc.) color. Specifically, I want bit-strings that differ in few bits, regardless of position, to map to colors that are closer together than bit-strings that differ in many bits.
There are many ways to trivially map an arbitrary string to a color if similarity does not matter. You could split the string into 3 equal (+/-1 bit) sections and then map those smaller bit-strings to $[0,255]$ based on their maximum possible value, i.e. for an $n$-bit substring, divide its integer value by $2^n-1$. This method, however, is sensitive to bit position, as higher-order bits have a much greater effect on the color.
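For concreteness, the split-into-three mapping above can be sketched as follows (a minimal sketch, assuming the genome is packed into a Python `int`; the function name is mine):

```python
def genome_to_rgb(genome: int, n_bits: int) -> tuple:
    """Split an n-bit genome into 3 roughly equal substrings and
    scale each substring's integer value to the range [0, 255]."""
    # Split the bit count as evenly as possible; the first chunk
    # absorbs any remainder.
    base, extra = divmod(n_bits, 3)
    sizes = [base + (1 if i < extra else 0) for i in range(3)]

    channels = []
    shift = n_bits
    for size in sizes:
        shift -= size
        value = (genome >> shift) & ((1 << size) - 1)
        # Scale by the substring's maximum possible value, 2^size - 1.
        channels.append(round(value * 255 / ((1 << size) - 1)))
    return tuple(channels)
```

Note how a single flipped high-order bit moves one channel by over half its range, while a flipped low-order bit barely moves it; that is exactly the position sensitivity described above.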
My only other thought was to compare to some fixed constant using a bitwise operator like AND or XOR and then count the number of differing bits. Since I know all genomes will be the same length, I figured I would compare against the same-sized string of all 1's. A string of all 1's would then give the value 0 (zero differing bits) and a string of all 0's would give a value equal to the number of bits (or 1, if I divide by the bit count). This does give me a good way of assigning a 'difference' value that is independent of bit position, but it also has several problems. First, a single linear function is not ideal for covering a full 3D color space (perhaps a space-filling curve could help?). Second, and more disqualifying, very different strings that are equally different from whichever constant I choose end up with the same value; e.g. 10 and 01 are 100% different from each other, but both 50% different from the string 11, and I wouldn't want those two to get similar colors. Perhaps this approach would be appropriate for saturation or brightness in an HSL/V color space, but definitely not hue.
I assume that this problem has no perfect solution; there are just too many ways two bit-strings can differ from each other and too little color space. Maybe some combination of these two approaches could get me close enough.
This is not a complete or optimal solution, but have you considered using an error-correcting code? Every codeword is an $n$-bit string, and every pair of codewords differs in at least $k$ bits, which means the code can detect up to $k-1$ errors, or correct up to $\lfloor (k-1)/2 \rfloor$ errors by finding the closest codeword to the received message. If you assign a particular colour to each codeword and to all the bit strings that decode to it, then bit strings will only get the same colour if they are 'close' to one another. However, bit strings close to the boundary between one codeword's domain and another's will get assigned different colours.
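As a toy illustration of the idea (not a serious code): a 3-bit repetition code decoded by majority vote, with the decoded codeword hashed to an arbitrary but stable colour. All the function names here are my own, and the repetition code merely stands in for a real error-correcting code:

```python
import hashlib

def decode_majority(bits: str, block: int = 3) -> str:
    """Decode a simple repetition code: each block of `block` bits
    collapses to its majority bit, so strings a few flips apart
    often decode to the same codeword."""
    return ''.join(
        '1' if bits[i:i + block].count('1') * 2 > block else '0'
        for i in range(0, len(bits), block)
    )

def codeword_colour(bits: str) -> tuple:
    """Hash the decoded codeword to a stable RGB triple."""
    digest = hashlib.sha256(decode_majority(bits).encode()).digest()
    return digest[0], digest[1], digest[2]
```

Here `'111000111'` and `'110001011'` differ in three bit positions but decode to the same codeword, so they receive the same colour; the boundary problem remains for strings whose blocks sit at a 50/50 split plus one.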
One approach to fixing that is to pick several codes working at different resolutions (i.e. different minimum distances), and then mix the resulting colours together. (BCH codes, for example, can be designed with any desired minimum distance.) However, you ideally need codes whose message length matches your genome length, which is massive by the standards of the codes I know. I don't know enough coding theory to say how hard it would be to construct a suitable set of codes.
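The mixing step might look something like the sketch below. Majority-vote decoding at two different block sizes stands in for the two codes of different minimum distance (a real implementation would use proper codes such as the BCH codes mentioned above); all names are my own:

```python
import hashlib

def majority_decode(bits: str, block: int) -> str:
    # Repetition-code decoding: collapse each block to its majority bit.
    return ''.join(
        '1' if bits[i:i + block].count('1') * 2 > block else '0'
        for i in range(0, len(bits), block)
    )

def hashed_colour(s: str) -> tuple:
    d = hashlib.sha256(s.encode()).digest()
    return d[0], d[1], d[2]

def mixed_colour(bits: str, blocks=(3, 5)) -> tuple:
    """Decode at several resolutions and average the resulting colours,
    so a string near one code's decoding boundary still shares most of
    its colour with its neighbours via the other code."""
    colours = [hashed_colour(majority_decode(bits, b)) for b in blocks]
    return tuple(sum(c) // len(colours) for c in zip(*colours))
```

Two strings only get fully different colours when they fall into different domains at every resolution, which softens the boundary effect without eliminating it.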