Creating a function that maps a base 36 number into an RGB Color (tuple of 3)

I am working on code that needs to do the following:

Imagine I have a string of characters, for example: $\text{dk48vns203lvm923dpvgj39dkv}$

The characters can be digits $0,\dots,9$ or lowercase letters $a,\dots ,z$, so there are $36$ options per character. I would like to create a function that maps part of this string to an RGB color, that is, three integers in the range $0$ to $255$, for example $\text{fj3k0} \mapsto (233,9,48)$. The total number of RGB colors is $256^3 = 16\ 777\ 216$. Each character has $36$ options, so $4$ chars give $36^4 = 1\ 679\ 616$ combinations and $5$ chars give $36^5 = 60\ 466\ 176$. I want the function to produce as few collisions as possible and to be uniformly distributed. I have a working solution right now, but I believe it can be improved.
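The counting argument can be checked with a few lines of Python. This is only an illustrative sketch; the helper name `chars_needed` is made up for this example and is not part of the question.

```python
RGB_COLORS = 256 ** 3  # 16 777 216 distinct RGB colors


def chars_needed(alphabet_size: int = 36, targets: int = RGB_COLORS) -> int:
    """Return the smallest k such that alphabet_size**k >= targets."""
    k = 0
    total = 1
    while total < targets:
        total *= alphabet_size
        k += 1
    return k


# 4 chars are too few to reach every color, 5 chars are more than enough.
print(36 ** 4 < RGB_COLORS < 36 ** 5)
print(chars_needed())
```

So at least $5$ characters are required before every RGB color can be hit at all.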

My solution:

Take $6$ chars; each pair of chars maps to one number in the range $0$ to $255$ ($2$ chars for R, $2$ chars for G, $2$ chars for B).

Each pair is read as a base $36$ number, which means $00$ is $0$ and $\text{zz}$ is $1295$ $(35 \times 36^1 + 35 \times 36^0)$. I then scale the result to the range $0$ to $255$.
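For reference, here is a minimal sketch of this pair-wise approach. The question does not give the exact scaling formula, so the factor $255/1295$ with rounding down is an assumption, and the function names are invented for the example.

```python
def pair_to_channel(pair: str) -> int:
    """Read two base-36 chars as a number 0..1295 and scale it to 0..255."""
    n = int(pair, 36)       # "00" -> 0, "zz" -> 1295
    return n * 255 // 1295  # assumed scaling; integer division rounds down


def six_chars_to_rgb_scaled(s: str) -> tuple:
    """Map 6 base-36 chars to (R, G, B), two chars per channel."""
    if len(s) != 6:
        raise ValueError("expected exactly 6 characters")
    return tuple(pair_to_channel(s[i:i + 2]) for i in (0, 2, 4))


print(six_chars_to_rgb_scaled("00zz00"))
```

With this scaling, $1296$ inputs per channel are squeezed into $256$ outputs, so roughly five different pairs share each channel value.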

Main problems:

  • Not really unique: many different strings map to the same color.

  • Because of the scaling, many values become non-integers, for example $700$ becomes something like $137.84$. RGB channels are integers, so the value is rounded down, which hurts uniqueness even more.

  • I am using $6$ chars. $6$ chars have $36^6 = 2\ 176\ 782\ 336$ possible values, but this mapping only has $16\ 777\ 216$ RGB tuples to hit, so a lot of information is wasted.

Thanks

There are 2 best solutions below

Best answer

You could take the remainder of the number represented by the two characters when divided by $256$; that is, if the number represented is $n$, compute $n \bmod 256$.

A small problem is that two characters have $36^2 = 1296 = 5\cdot 256 + 16$ combinations, so this 'favors' the remainders $0$ through $15$: each of them occurs $6$ times instead of $5$. The excess is only $16$ out of $1296$ inputs, which is a little more than $1$ in $100$. I don't think that's bad, and if it is acceptable to you as well, then go for it.
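The size of this bias is easy to verify by brute force. The sketch below counts how often each remainder occurs over all $1296$ two-character inputs; the names `pair_to_channel_mod`, `counts` and `favored` are invented for this illustration.

```python
from collections import Counter


def pair_to_channel_mod(pair: str) -> int:
    """Read two base-36 chars as a number and reduce it modulo 256."""
    return int(pair, 36) % 256


# Tally every remainder produced by the 36**2 = 1296 possible pairs.
counts = Counter(n % 256 for n in range(36 ** 2))

# Remainders that occur one extra time (6 times instead of 5).
favored = sorted(value for value, c in counts.items() if c == 6)

print(sorted(set(counts.values())))
print(favored)
```

The tally shows two frequencies only, $5$ and $6$, and the remainders occurring $6$ times are exactly $0$ through $15$.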

Otherwise, you can take $6$ characters, take the number $m$ represented by that, calculate $m'=m \bmod 256^3$, and then compute the RGB values as follows:

$$R = m' \bmod 256$$

$$G = \frac{m'-R}{256} \bmod 256$$

$$B = \frac{m'-R-256\cdot G}{256^2}$$

Some colors will still be more likely than others, but the imbalance is smaller: since $36^6 = 129\cdot 256^3 + 12\ 521\ 472$, the favored colors occur $130$ times instead of $129$, and the excess affects only about $1$ in $170$ inputs.
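A minimal Python sketch of these three formulas follows. The function name `base36_to_rgb` is hypothetical, and integer division is used in place of the explicit subtractions, which gives the same result because $m' < 256^3$.

```python
def base36_to_rgb(s: str) -> tuple:
    """Map a base-36 string to (R, G, B) via m' = m mod 256**3."""
    m = int(s, 36)             # number represented by the characters
    m_prime = m % 256 ** 3     # fold into the 16 777 216 RGB values
    r = m_prime % 256          # lowest 8 bits
    g = (m_prime // 256) % 256 # middle 8 bits, same as ((m' - R) / 256) mod 256
    b = m_prime // 256 ** 2    # highest 8 bits, same as (m' - R - 256 G) / 256**2
    return (r, g, b)


print(base36_to_rgb("000073"))  # int("73", 36) is 255
print(base36_to_rgb("dk48vn"))
```

The same function works for inputs longer or shorter than $6$ characters, although fewer than $5$ characters cannot reach every color.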

Second answer

It's very inconvenient to represent colors in base 36. The most common representation of a pixel (which can take 1 out of about 16 million colors) uses a 32-bit datatype, not 36 bits: the red, green, blue and alpha channels use 8 bits each. In a 36-bit word we would be left with 4 bits, which we could use to represent part of the next pixel in the line. This is sometimes referred to as pixel packing; some video modes in the past used it for different color modes, and if I recall correctly SIMD instruction sets use a similar technique. But I have a feeling that base 36 is more complicated than aligned representations.

Now, since your question concerns 36 symbols instead of 16 (as in hexadecimal, 0-9 and a-f), the alphabet is $\frac{36}{16} = 2.25$ times larger. $16$ symbols can be represented using $\log_2(16) = 4$ bits. $36$ symbols need $\log_2(36) = 5.1699\ldots$ bits. Since bits act like flags (that are either on or off), the number of bits has to be a whole number, so a base-36 symbol does not align to whole bits.

Imagine a processor register having 36 bits. We then have $2^{36} = 68719476736$ combinations, which is a lot more than 16 million colors. Colors would be represented like this:

[ggggbbbbbbbbrrrrrrrrggggggggbbbbbbbb] register 1.

Here we see that the second pixel occupies the bits on the left and is missing some of its bits.

It is not convenient to have a fractional part, so instead of representing 4.5 channels we represent $9$ channels, using two symbols each time instead of one. This means the total number of symbols that we store has to be an even number.

Two symbols would use two $36$-bit registers and look like this:

[rrrrrrrrggggggggbbbbbbbbrrrrrrrrgggg] register 2 (continue of 1).
[ggggbbbbbbbbrrrrrrrrggggggggbbbbbbbb] register 1.

Now we see that the two registers represent 3 colors, i.e. "packed pixels". We used base 2 above, where r, g, b are bit flags, but it doesn't matter which base you use to visualize what is going on. Example: $8g_{36}$ is the same as $100110000_2$. They are just different representations of the same number.
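The base-conversion example can be checked directly, since Python's built-in `int` accepts a base between 2 and 36. This snippet is only an illustration; the variable names are arbitrary.

```python
value_from_base36 = int("8g", 36)        # 8 * 36 + 16
value_from_base2 = int("100110000", 2)   # 256 + 32 + 16

# Both strings denote the same integer, written in different bases.
print(value_from_base36 == value_from_base2)
print(bin(value_from_base36))
```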

In your example you use a 26-character string:

dk48vns203lvm923dpvgj39dkv

It fits the description above about representing colors with an even number of symbols, since 26 is even. You are not using the alpha channel, so 3 channels (24 bits) represent one pixel. You can then pack 3 pixels using two symbols as shown above. Under this scheme, your string of 26 symbols would represent $\frac{26}{2}\cdot 3 = 39$ pixels.

If you want to do packing, normalizing numbers is not needed as long as you restrict your string to an even number of symbols; normalizing can also complicate the programming and the encoding and decoding. Using real numbers for color representation can overcomplicate things further. Stick to simple solutions first, and if you later consider JPEG or similar compression, learn how colors are represented in base 2.

You need some logical operations and bit manipulation to convert between these pixel formats.
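As a rough sketch of the kind of bit manipulation meant here, the following packs three 8-bit channels into one 24-bit integer laid out as `rrrrrrrrggggggggbbbbbbbb` and unpacks it again. The function names `pack_rgb` and `unpack_rgb` are invented for this example, and the layout is an assumption taken from the register diagrams above.

```python
def pack_rgb(r: int, g: int, b: int) -> int:
    """Pack three 8-bit channels into one 24-bit integer (r in the high bits)."""
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("channel out of range 0..255")
    return (r << 16) | (g << 8) | b


def unpack_rgb(pixel: int) -> tuple:
    """Extract (r, g, b) from a 24-bit pixel using shifts and masks."""
    return ((pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF)


pixel = pack_rgb(233, 9, 48)
print(hex(pixel))
print(unpack_rgb(pixel))
```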