Forward Error Correction in $32$ bits

199 Views Asked by At

I'm a software dev (Info theory is not my strong point) and have been given a coding requirement as follows...

  • We need to create around half a million unique unsigned $32$ bit numbers that will be assigned to individual users as base 10 to enter into a field (the final number must fit in an unsigned 32 bit field - so in the range of $0$ to $4,294,967,296$)
  • Requirement is to encode some form of Forward Error Correction (FEC) so that if a user makes a typo in their assigned number - of at least a single digit, we can error correct.
  • We have no control over the data entry, so using a few digits for a checksum to detect and prevent errors at time of input isn't possible. We want to be able to detect and correct a mistyped entry after the data has been saved and sent to us for batch processing.
  • Detection of an error wouldn't be hard as we would have a lookup table of users and their numbers. The recovery part comes in when the lookup fails and we want to attempt to assign the miskeyed number to the correct user.
  • A bad outcome would to have the (hamming?) distance between numbers too close, where a typo from a user resulted in them 'hitting' the number of another user

Is this requirement possible?

Naively if say we used repetitive coding to use $16$ bits for the code and $16$ bits to repeat, we only get $2^{16} = 65,536$ unique numbers. Are there any other techniques we could apply?

1

There are 1 best solutions below

4
On BEST ANSWER

Are you giving out the numbers as base $10$ and having the users enter them that way? Although you can convert to binary and use one of the error correcting codes there, that is not the type of error you want to correct. Presumably hte common errors are mistyping a single digit and transposing two neighboring digits. You could look at the Luhn algorithm used (among other things) for validating credit card numbers. It detects all single digit typos and transpositions of neighboring digits. It does not correct them, however.

I did a quick web search for base $10$ error correcting codes and came up empty, but they must exist.

If I had to homebrew something I would start with eight data digits (you say you need six, but in case of growth), append the Luhn check bit, then append checksums of three groups of six of the nine preceding digits that include each digit twice between them. I suspect you can justify that you can correct any single substitution and any neighboring transposition that way, but I haven't checked it.