How to minimize the maximum Hamming distance of a linear block code.

985 Views Asked by At

I suspect it is possible to choose generators of 2^l so that:

  1. Each number 1-l is in some generator.
  2. The maximum Hamming distance between any two vectors would be at most (l+1)/2.

For instance, suppose the length is l=7, and

a=1111000, b=0001100, c=0010010, d=0111001

In the notation below, I will write "1234567" where the 1's should be, and minuses where 0's should be, to make it easier to read. Then the linear block code generated by a, b, c, d, e is:

{o=-------,   a=1234---,   b=---45--,   ab=123-5--,
 c=--3--6-,  ac=12-4-6-,  bc=--3456-,  abc=12--56-,
 d=-2----7,  ad=1-34--7,  bd=-2-45-7,  abd=1-3-5-7,
cd=-23--67, acd=1--4-67, bcd=-234567, abcd=1---567}

Since it is a linear block code, all the Hamming distances are actually one of the words. So we see that the maximum Hamming distance here is 6. If it were not for the word bcd, the maximum Hamming distance would be 4 = (l+1)/2. I suspect it is possible to choose generators so that:

  1. They would cover all numbers 1-l.
  2. The maximum Hamming distance would be at most (l+1)/2.

Is it possible? For example, for l <= 7 I wrote out all combinations, and this is not possible. But possibly there is a higher l for which it is possible? Or maybe this reduces to some other theorem in coding theory?

And let me know if I can clarify something, since this is my first time trying to figure something out in code theory.

This question is similar to, but should be much easier than, Maximal Hamming distance .

1

There are 1 best solutions below

1
On BEST ANSWER

If I understood your goals correctly they are achieved at least by the so called simplex codes. They are binary linear codes of length $\ell=2^n-1$, dimension $n$, and and with the property that all the non-zero words have Hamming weight $2^{n-1}=(1+\ell)/2$.

These can be constructed by shortening the Reed-Muller code $R(1,n)$ at a single position, or by taking as non-zero words all the cyclic shifts of an $m$-sequence.

With $\ell=7$ use the (Reed-Mullerish) generator matrix $$ G=\left(\begin{array}{c} 0001111\\ 0110011\\ 1010101 \end{array}\right) $$ or the cyclic ($m$-sequence based) variant $$ G=\left(\begin{array}{c} 1011100\\ 0101110\\ 0010111 \end{array}\right). $$


The stuff below is not quite what I wanted it to become. Sorry, it's getting late here...

In general, assume that $C$ is a binary linear code of length $\ell$, dimensiion $n$, has all the bit positions in the support of at least one word (you call this "covers all the numbers") and has maximum weight $(1+\ell)/2$. In a linear code half the words have a one at any covered position, so $C$ has altogether $2^{n-1}$ words that have a $1$ in position $i$ for all $i=1,2,\ldots,\ell$. Therefore the total sum of the weights of the words is $$ S(C)=\sum_{x\in C}w(x)=\ell\,2^{n-1}.\qquad(*) $$ On the other hand, if the maximum Hamming weight is $(1+\ell)/2$, then we get the inequality (observe that the all zeros word is in $C$) $$ S(C)\le(2^n-1)\,\frac{1+\ell}2. $$ Together these imply $$ \ell\,2^n\le (2^n-1)(1+\ell) $$ which is equivalent to $\ell\le2^n-1$.

This can be used to prove the non-existence of such a code for some values of $\ell$. For example with $\ell=5$. This forces $n\ge3$. Assume that a code $C$ of maximum weight $(1+\ell)/2=3$ existed. Then the even weight subspace $C_e$ of $C$ is of dimension $n-1$. All the non-zero words of $C_e$ have weight at most $2$, so $$ S(C)\le 2(2^{n-1}-1)+3\cdot 2^{n-1}=5\cdot 2^{n-1}-2<5\cdot 2^{n-1} $$ in violation of equation $(*)$.