Discover parameters of a Reed-Solomon code from its output

1k Views Asked by At

chirp.io is a site/app for sharing e.g a photo identified by a short FSK audio chirp. The chirp is 10 symbols of data, then 8 symbols of error correction. These symbols are 32-valued (5 bits/symbol) and the error correcting code is a Reed-Solomon code. Thus, a chirp is the output of an $ RS[n,k,t]$ $= RS[18, 10, 4]$ encoder over the field $GF(32)$ or $\mathbb F_{2^5}$. These details are from a partial description of the chirp.io protocol

An example chirp is gfhd9532dm (base 32) for which the error parity symbols are 4fbeu0mo. Given this information is it possible to determine the other parameters (e.g. the generator polynomial of the Galois/finite field) of the coder, in order to check/correct a received chirp?

So far I've systematically tried candidate parameters (i.e brute force search) with trials.py but without success.

ETA: api.chirp.io returns a chirp (and parity symbols) in response to a JSON POST. e.g.

$ curl -X 'POST' -H 'Content-Type: application/json' \
       -d '{"body":"abc","mimetype":"text/plain","title":"abc"}' \
       'https://api.chirp.io/0/chirp'
{"longcode": "ovkp99793iao89q5ku", "shortcode": "ovkp99793i", "is_new": 1}

I'm guessing that making more than a few such requests would trigger blocking or rate limiting, and doing so without express permission isn't something I'm willing to do.

ETA 2017-02-05: The original Chirp app has been discontinued, the above request now returns HTTP 404.

1

There are 1 best solutions below

0
On BEST ANSWER

From just the information given (that one codeword of a systematic $RS[18,10,4]$ code is gfhd9532dm4fbeu0mo), it is not possible to identify what the code is or how the finite field was set up as binary $5$-tuples etc. The problem is better viewed as attempting to determine a hashing function that maps gfhd9532dm onto 4fbeu0mo from just the results of this single hash. Many hashing functions will give this result, and even if one is found (and even if it seems to fit what a Reed-Solomon encoder would be expected to do), there is no reason to believe that the function is the correct one and will work on the next 10 data symbols too. From a cryptographic perspective, you have a known plaintext attack with a single plaintext available. Multiple known plaintexts would be better, and a chosen plaintext attack where you are allowed to specify the data symbols (especially with multiple chosen plaintexts) would be even better.