I'm trying to find the most suitable algorithm to convert a string $\alpha$ over the alphabet $\Sigma$ of size $| \Sigma | = n$ to string $\beta$ over the alphabet $\Omega$ of size $| \Omega | = n-1$, and vice versa. By suitable I mean the balance of algorithm speed, implementation complexity, and overhead of the string length $| \beta | - | \alpha |$.
I've found several algorithms, but maybe there is some unknown to me. Perhaps you know such algorithms. Here's what I found:
- Conventional byte (character) stuffing. Pretty simple, but worst-case overhead is 200%.
- Consistent overhead byte stuffing (COBS). Also quite easy to implement. Worst-case overhead is much better than conventional byte stuffing.
- Think of the input string as a long number of base $n$, convert this number to base $n-1$. Very complex and computational heavy as compared with previous methods, but worst-case overhead is better if I'm not mistaken.
Anything else?
I'm not sure if this counts, but it's an idea you might be interested in. In real world applications, you can identify a string in the $\Omega$ alphabet that would never ever ever arise in a real message. Neither in the $\Omega$ alphabet, nor in the $\Sigma$ alphabet after the simple embedding from $\Omega$ to $\Sigma$. Then you can use that string to represent one letter from $\Sigma$.
Let's say $\omega$ is such a string, something like "fhqwhgads" in English, although you could get away with something shorter. For notation, let $$\Sigma=\{s_0,s_1,\ldots, s_{n-1}\}\qquad\Omega=\{w_1,w_2,\ldots,w_{n-1}\}$$
Then map $$\DeclareMathOperator{\strings}{strings}f:\strings(\Sigma)\to\strings(\Omega)\quad f(s_i)=w_i\text{ when }i>0\quad f(s_0)=\omega$$ The inverse map would first examine a string $\beta$ for instances of $\omega$ and convert them to $s_0$, and then convert remaining $w_i$ to $s_i$. This relies on you knowing that $\omega$ would never practically arise in a real message from either alphabet.