Example of encoding scheme which is not prefix-free

100 Views Asked by At

I am trying to give an example of an encoding scheme that is not prefix-free but for which reading-left-to-right is a valid decoder-recognizer.

I am using code alphabet $=\{0,1\}$. Source alphabet $S$ is composed of letters $\{a,b\}$

I have the simple scheme of:

a $->$ 001

b $->$ 00

Does this satisfy the question? $b$ is a prefix of $a$ but when read left-to-right the code is decodable, correct? For example, 000000100001 is still decodable when read left-to-right.

1

There are 1 best solutions below

0
On

Your question is not precise enough. In your example, compare the words $u = 0000$ and $v = 00100$. When you read these words from left to right, you first encounter $00$, so you may think that the first letter is $a$. However, this interpretation is correct for $u$, but not for $v$. The trick is that you are implicitely using a lookahead.

Codes that can be decoded, from left to right, with a finite lookahead are called codes with finite deciphering delay. They form a class of codes intermediate between prefix codes and general codes. See the book Codes and Automata by Jean Berstel, Dominique Perrin and Christophe Reutenauer. The formal definition is also recalled in this question:

A subset $X$ of $A^+$ has a finite deciphering delay if there exists an integer $d \geqslant 0$ such that the following condition holds: For $x,x'\in X$ and $y\in X^d$, $y'\in X^*$, $xy$ is a prefix of $x'y'$ implies $x=x'$.

I let you verify that your example $\{00, 001\}$ is a code of deciphering delay 1. An interesting example of a finite code with no finite deciphering delay is $\{1, 10, 00\}$: when you decode a word of the form $10^n$ from left to right, you need to read the whole word to decide whether the first block is $1$ or $10$.

[1]