Largest number of shards for linear erasure codes over a finite field

201 Views Asked by At

I'm wondering about the largest linear erasure codes you can make given that shards are an elements of a particular finite field.

If you have $D$ data shards and $P$ parity shards, and a finite field $\mathbb{F}_q$, the classic $D+P$ systematic Reed-Solomon encoding is to fit a polynomial function $f$ to $D$ points, and then evaluate it at $P$ extra points such that any subset of $D$ points is enough to reconstruct the polynomial and thus the original points. This works as long as $D+P \le q$.

If $D+P > q$, there are still linear codes that can work. For example, in $\mathbb{F}_2$ (and indeed in any field), the $D+1$ code that generates a parity shard as the sum of the data shards, and the $1+P$ code that makes P copies of the one data shard both work.

There can be other usable codes in bigger fields. In $\mathbb{F}_4$, there are also $2+3$, $3+2$, and $3+3$ codes, but no others where $D+P \gt 4$.

What's the largest code you can make for a given finite field? How can I find these codes?

2

There are 2 best solutions below

0
On BEST ANSWER

In the framing of $D×P$ matrices over $\mathbb{F}_q$, the following points can be exhaustively tested and shown to be true for all $q\le 27$:

  • $D×1$ and $1×P$ are possible, for any $D$ or $P$
  • $D×P$ is possible provided that $D+P\le q+1$ (NB: one better than a Vandermonde or Cauchy construction)
  • When $q$ is even, $(q-1)×3$ and $3×(q-1)$ are possible (NB: one better than the above point)
  • The above three points are the limits of possibility; anything larger is impossible

The first point is trivial to construct; take a matrix consisting entirely of $1$ elements.

Constructions are known for the middle two points, based on taking a Vandermonde matrix and then appending an extra column $\begin{bmatrix}0 & \cdots & 0 & 1\end{bmatrix}$ or two extra columns $\begin{bmatrix}0 & 0 & 1\end{bmatrix}$ and $\begin{bmatrix}0 & 1 & 0\end{bmatrix}$. See https://web.mat.upc.edu/simeon.michael.ball/Cardona2011.pdf

The last point is conjectured to be true for all $q$, and this is known as "the MDS conjecture" (Segre 1955). It has been shown to be true for prime $q$, but the general case remains unproven. See https://arxiv.org/abs/1201.5994

1
On

Rephrasing the question: we're trying to find $D×P$ matrices over $\mathbb{F}_q$ such that every square submatrix is invertible. Note that this is using the broadest possible definition of square submatrix: delete any (possibly empty) subset of rows and/or columns, such that the resultant matrix is square. The question is then: what are the largest possible $D$ and $P$ for a given $q$?

This framing has some nice properties: if $D×P$ is possible, then so is $(D-1)×P$ and so is $D×(P-1)$. Conversely, if $D×P$ is impossible, then so is $(D+1)×P$ and so is $D×(P+1)$. Additionally, as transposing has no effect on invertibility, if $D×P$ is possible, then so is $P×D$. Conversely, if $D×P$ is impossible, then so is $P×D$.

If we have such a $D×P$ matrix, then it cannot contain any $0$ elements, as the $1×1$ submatrix consisting of just that element would not be invertible. One neat consequence of this is that the leading element in each row and each column will be non-zero, so we can rescale each row and each column to have a $1$ in the leading position, and this maintains our invertibility requirement. This reduces our search space slightly: we're trying to find $D×P$ matrices over $\mathbb{F}_q$ such that every square submatrix is invertible, no elements are $0$, the leading element of every row and every column is $1$, and no other elements are $1$.

Given this, there are various matrices that do fit the requirement (as do their transposes):

  • $D×1$, where the matrix consists of $1$ in every position
  • $D×2$, provided $D\le q-1$, where the matrix consists of $1$ along the leading row and the leading column, and the remaining $D-1$ elements are all distinct elements from $\mathbb{F}_q \setminus \{0,1\}$
  • $D×P$, provided $D+P\le q$, where the matrix is constructed as a Cauchy matrix and then rows/columns rescaled to make the leading elements $1$
  • $3×3$, provided $q \ge 4$. For $q \ge 6$, this is covered by the previous point, so the only interesting cases are $q=4$ and $q=5$. In $\mathbb{F}_4 = \{0,1,a,b\}$, the following works:\begin{bmatrix} 1 & 1 & 1 \\ 1 & a & b \\ 1 & b & a \\ \end{bmatrix}In $\mathbb{Z}_5$, the following works:\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 3 & 4 \\ \end{bmatrix}

Some of the previous bounds are tight:

  • $D×2$ is impossible when $D\ge q$, as it would have two copies of some element from $\mathbb{F}_q \setminus \{0,1\}$, and the $2×2$ submatrix including those two copies would not be invertible
  • $3×3$ is impossible when $q\le 3$, as the only possible matrix is the following:\begin{bmatrix} 1 & 1 & 1 \\ 1 & a & a \\ 1 & a & a \\ \end{bmatrix}

The grey area is $2\le D\lt q, 2\le P\lt q, D+P\gt q$. More sophisticated arguments are required to determine what is possible in this area, though I wouldn't expect very much to be possible beyond what is described above.