Encoding conditions into single number

715 Views Asked by At

I have N boolean conditions such as: [condition_1:true], [condition_2:true], [condition_3:false], [condition_4:true].

I can present it as following:

  1. Using binary numbers, e.g., 1101
  2. single int32. I can also assign to each condition a prime number and present the condition by multiplication of prime numbers, for instance: [condition_1:true -> 2], [condition_2:true -> 3], [condition_3:false -> 5],[condition_4:true -> 7]. I can present these conditions by: 2x3x7 = 42 And using prime factorization, I can go back to conditions if I'm given the number 42; which is a great point indeed.

What's the drawback ? These two methods work perfect except for the number of the conditions they allow. If I have, lets say 2,000 conditions non of the two will work.

Why? Obvious: Would it be possible to multiply the first 2,000 prime numbers and keep the result?

What could we do then? A computer engineer would simply say: make an string of arbitrary length and code this condition to it. Well, this is probable but it's a waste of memory which is beyond the demand here.

Another computer engineer would say: well group all conditions into a 32-bits group and store them as an array of int32. This is a good idea, but still it's not as optimal as I need.

I'm looking for a mathematician mind:

  • How can you map a vector on a single number, maybe in another space with higher-dimensions ?
  • You can superimpose a square matrix into a eigenvalue and eigenvector; can you think of similar situation here?
  • Surely you know about hashing functions, can't you think about a function that maps from a bigger domain 2^2000 to much smaller, such as space available for a double, or decimal?
  • Fourier transformation, discreet cosine transformation, and etc. are well accepted in fields such as movie compression, where you keep a key and changes to the key. Can you think about similar situations here?
1

There are 1 best solutions below

1
On BEST ANSWER

If you have 2000 conditions, there are $2^{2000}$ possibilities you must be able to code and analyze. A 2000-bit binary number is the most efficient way to store this information, but most computer systems have no numeric type that large. There is probably no reason to store this information as a single number - at worst you might want to do some bitwise operations like AND and OR with the information, but it’s unlikely you’d have to do arithmetic.

Depending on what system you’re using to store the 2000 conditions, maybe you have a bit-string data type to use. Maybe you can store the first 32 conditions in one integer value (unsigned 4-byte integer), the next 32 in a second integer, and so on, and use an array of integers to hold everything.

If you have to “roll your own” method to handle this information, you could also use strings of characters. You could waste space but make the results readable with a 2000-character string of 0s and 1s, or you could use “base 64” encoding so that 7 values could be encoded in a single character using A-Z, a-z, 0-9, $, and _ (or any other collection of 64 characters that would be interpreted consistently across systems.

It really depends what you need to do. There should be plenty of ideas online if you google “storing bit strings” for example.