What is the theory behind seeding a CRC?

874 Views Asked by At

The basic CRC check is a polynomial division in this form:

$$ Q(x)Gen(x) + Rem(x) = \frac {x^nMsg(x) } { Gen(x) } $$

It's recommended that the polynomial is seeded so that an all-zero message does not result in an all-zero checksum too.

As far as I can tell, with a seed the polynomial division equation becomes:

$$ Q(x)Gen(x) + Rem(x) = \frac {x^{2n}Seed(x) + x^nMsg(x) } { Gen(x) } $$

But if the message Msg(x) is zero, then is there not a case that the generator polynomial could be a factor of the scaled seed, and we get a zero remainded Rem(x)?

1

There are 1 best solutions below

1
On

The OP's "equations"are not quite right. If the data that is to be transmitted is represented by $d(x)$ and the CRC polynomial is $C(x)$ of degree $n$, then the no bells and whistles CRC algorithm divides $x^nd(x)$ by $C(x)$ to produce a quotient polynomial $q(x)$ and a remainder polynomial $r(x)$ of degree smaller than $n$, the degree of $C(x)$. The CRC checksum is $r(x)$. The equation describing the relationship is $$x^nd(x) = q(x)C(x) + r(x)\tag{1}$$ and not what the OP has written. Now from $(1)$, we get that $$q(x)C(x) = x^nd(x) - r(x) = x^nd(x) + r(x)\tag{2}$$ where the second equality is using the fact that in the binary field, addition and subtraction are the same operation. Thus, $x^nd(x) + r(x)$ is a multiple of the CRC polynomial $C(x)$. What is transmitted is $x^nd(x) + r(x)$. But notice that the coefficients of $x^0, x, x^2, \ldots, x^{n-1}$ in $x^nd(x)$ all are $0$ while $r(x)$ is of degree $n-1$ or less, and so we see that what is actually transmitted is the data bits (the $x^nd(x)$ part) followed by the CRC checksum $r(x)$: the trailing $n$ zeroes in $x^nd(x)$ have been neatly filled in by the CRC checksum.

Now, if $d(x)$, the data to be transmitted, is a long string of zeroes, then $(1)$ and $(2)$ show that the CRC checksum is a string is $n$ zeroes, and so the entire transmission is a long string of zeroes. This is deemed to be unsuitable for a variety of reasons, and there are various bells and whistles versions of the CRC computation that avoid the possibility that a transmission consists entirely of zeros. One of these bells and whistles versions prepends a nonzero header to the data to be transmitted (the header is discarded at the receiver). Thus, if we have $M$ bits of data (and so $d(x)$ is of degree $M-1$) and the header $h(x)$ is $m$ bits in length, then what gets divided by $C(x)$ is not $x^nd(x)$ but instead $x^{m+M+n}h(x) + x^nd(x)$ which stream of bits is the header $h(x)$ followed by the data string $d(x)$ followed $n$ zeroes. What gets sent out over the channel (or recorded in memory or disk or flash drive) is $x^{m+M+n}h(x) + x^nd(x) + \hat{r}(x)$ where $\hat{r}(x)$ is the remainder when $x^{m+M+n}h(x) + x^nd(x)$ is divided by $C(x)$; it is different from the remainder when $x^nd(x)$ is divided by $C(x)$. In particular, even if $d(x)$ is a stream of zeroes, $\hat{r}(x)$ is nonzero (unless the system designer not thought through the problem and has chosen $h(x)$ to be $C(x)$ itself (or a polynomial multiple of $C(x)$ which should be a fireable offense!) The CRC checksum being all zeroes is not an issue now even if $d(x)$ is all zeroes: the header and CRC checksum are nonzero. Another alternative is to simply complement the first two data bytes (used if $C(x)$ has degree 16) or four data data bytes (if $C(x)$ has degree 32) before passing them to the division circuit that computes the remainder from the division of $x^nd(x)$. The receiver must, if curse, remember to undo the complementation suffered by the leading data bytes at the transmitter. This method does not increase the transmission length but has the obvious disadvantage that if the data happens to be the size of the US national debt (a number in the trillions with a large number of trailing zeros), complementation of the leading bytes at the transmitter might modify the transmitted data stream to be all zeroes!