Calculating number of errors that is undetected by a CRC using a particular polynomial

1k Views Asked by At

My goal is to determine optimal generator polynomial and size of the Frame Check Sequence (CRC field) for protocol that I am developing. Reading the paper, "Cyclic Redundacy Code (CRC) Polynomial Selection For Embedded Networks" by Koopman and Chakravarty, provided me with a solid understanding of the topic and proved to be a valuable resource.

Especially interesting for me is Table 1. (provided below), "Example Hamming weights for data word size 48 bits". My question is how to come up with this table on my own? That is, how to calculate number of undetectable errors, provided generator polynomial and data word size? I am not sure if this number is also dependent on bit error ratio (BER).

"Example Hamming weights for data word size 48 bits" from "Cyclic Redundacy Code (CRC) Polynomial Selection For Embedded Networks" by Koopman and Chakravarty

Namely, I want to write script to calculate these, as data provided by authors of the paper do not cover polynomials and data lengths that I have at hand.

Any reference to the material where I could get necessary knowledge to come up with an algorithm is welcome (preferably without entering into field theory).

2

There are 2 best solutions below

0
On

The minimal distance of a code is the minimal weight of a non-zero codeword. The "Hamming weights for data word size 48 bits" for a specific number $n$ of corrupted bits is in fact the number of polynomials $p(x)\in \Bbb{Z}_2[x]$ with $deg(p) < 48, weight(p) = n $ that are multiples of the generating polynomial of the CRC. This is because you're looking for the number of polynomials of weight $n$ (number of bit error schemes with $n$ corrupted bits) that you can add (xor) to any multiple of the generating polynomial (legal codeword), and still get a multiple of the generating polynomial (a legal codeword).

If we'll take for example the CCITT-16 - it's polynomial is $p_G(x) = 1+x^4+x^{11}+x^{16}$. Then you're asking how many polynomials of degree < 48 are multiples of $p_G(x)$.

I know that the problem of finding low weight multiples of a polynomial, with bounded degree $n$ is very difficult. You can take a look at this paper:

https://uwspace.uwaterloo.ca/bitstream/handle/10012/5455/spmul.pdf;jsessionid=D7F3887C75609867944E6FE0158C2FEF?sequence=1

I'll add that if your polynomial has an even hamming weight, then every bit-scheme with an odd hamming weight will be detected. That is because a polynomial in $\Bbb{Z}_2[x]$ has an even hamming weight if and only if it is divisible by $1+x$. Therefore, for any polynomial with an even hamming weight, "The hamming weights for data word size $n$ bits" with an odd number of corrupted bits will be zero.

3
On

For others reading this, the non-zero numbers in the right half of the table are the number of failure cases out of comb((48+crc size), # errors). On the first row, for 4 bit errors, CCITT-16 has 84 failing cases out of comb(64, 4) = 635376 possible patterns. For 6 bit errors, there are 2430 failing cases out of comb(64,6) = 74974368 possible patterns.

To generate these tables, an optimized brute force search is used. Consider the case of looking for all patterns of 6 bit errors in a 64 bit message (48 data, 16 crc). The test message is all 0 bits except for six 1 bits that represent the errors. Rather than actually testing for 6 bit errors, all 41664 combinations of 3 bit errors are used to create a table of CRCs stored along with the indexes of the 3 bits in error. The table is sorted by CRC, and then searched for duplicate CRCs, that don't have duplicate error bit indexes. If there is a duplicate CRC, but two of the error bit indexes are the same in both entries, then that is a 4 error bit CRC that failed. There aren't any 2 error bit failures so no check is needed for that.

Rather than hard coding nested loops to generate all the combinations, there is a generic method known as next combination to generate all combinations of k indexes out of a possible n indexes (in order). Example code, where InitCombination sets up the array so that the first call to NextCombination actually generates the first set of indexes: {0,1,2,...}.

//----------------------------------------------------------------------//
//      InitCombination - init combination to first set - 1             //
//----------------------------------------------------------------------//
void InitCombination(int a[], int k, int n) {
    for(int i = 0; i < k; i++)
        a[i] = i;
    --a[k-1];     // 1st call to NextCombination will return 1st set
}

//----------------------------------------------------------------------//
//      NextCombination - generate next combination                     //
//----------------------------------------------------------------------//
bool NextCombination(int a[], int k, int n) {
int j = k - 1;
    while (j >= 0 && a[j] == n - k + j)
        --j;
    if (j == -1)
        return false;
    ++a[j];
    for (int i = j + 1; i < k; ++i)
        a[i] = a[j] + i - j;
    return true;
}

As for which polynomials can detect the most number of errors for a given message length, generally these will be the product of several irreducible polynomials. These are verified using optimized brute force and supercomputers for larger CRCs.

Here is a link to a "CRC Zoo" table for 32 bit CRCs. Click on "notes page" to understand what the values in the table represent.

https://users.ece.cmu.edu/~koopman/crc/crc32.html