If you can cram any more than $2^n$ unique, obtainable values into a binary string

119 Views Asked by At

When you typically talk about binary strings, you basically say that they have $2^n$ values. So 10 is 2, 11111111 is 255, etc.

But this is based on how we use the string. What we are doing in that situation is "take all possible combinations of individual slots of alphabet $\{0,1\}$ as a single set". This gives us 256 possible values in an 8-bit string, etc..

We can also use it differently. Instead of saying "give me all the combinations of an $n$ length string", we can say "fill the values from left to right with 1, and use those as the set". So you have:

1
11
111
....
1...n

So if you only allow 8 slots, then you only have 8 unique values, as opposed to 256. So we can use the same system to encode two different amounts of numbers, or two different sets/ranges of values (8 vs. 256). This comes up when programming a lot. For example, you can only encode 8 flags into an 8-bit number, but 256 integers into an 8-bit number.

So this got me wondering if you can encode more than 256 values into an 8-bit number, or more generally, more than $2^n$ values in an $n$-bit number. But the catch is, you should also be able to get the values out of the string. More on that next.

In some sense you can have more digits in there, as in this.... Take every 1 digit number + 2 digit $2^2$ numbers + 3 digit $2^3$ numbers + 4 digit $2^4$ numbers + 5 digit $2^5$ numbers, + ... n digit $2^n$ numbers. I'm not sure the equation to calculate this final value, but it's more than 256, since 256 is simply the last number in the equation at $2^8$.

But the problem is I don't think you can get the values out of the string. At least it seems hard or that you can't.

The question is if there is actually a way to use these extra values we've squeezed out of the $n$-length string, so that we can store more information in that amount of space. The way this would look is, you might have $10101010$ be the ID of some record, while $101010$ is the ID of another record, which comes from 10101010. To take it further, instead of the value coming at the end of the bit string (which makes sense because that is how you sum the bit string value), you could put it in the middle, which gives you even more possible values. As in 10101010.

There is probably some math equation to calculate how many potential unique combinations/values there are like this, but I'm wondering if they exist such that you could also use them for unique encoding. That is, that you could encode 10101010 and 10101010 and 10101010 into the same 8-bit string, while still giving you at least (if not way more than) 256 values like the basic $2^n$ $n$-bit encoding.

If it's not possible, I'd like to know why.

1

There are 1 best solutions below

4
On BEST ANSWER

You can only encode as many different values as you have different strings. If you allow strings from $0$ to $n$ bits there are $2^{n+1}-1$ of them, so you can encode that many values. You really should consider the string to have one more character which is a terminator because otherwise you don't know when the string is over. That is essentially the extra bit that gives twice as many strings.