Enumerating solutions of incomplete row reduced echelon form over multivariate polynomial ring over GF2

30 Views Asked by At

I'm currently experimenting with finding systems of multivariate equations, matching a given set of inputs with their roots.

I'll explain with a toy example:

Given four "valid" assignments of two inputs $x_0$ and $x_1$ and two outputs $y_0$ and $y_1$

$x_0$ $x_1$ $y_0$ $y_1$
$0$ $0$ $1$ $0$
$1$ $0$ $1$ $1$
$0$ $1$ $0$ $0$
$1$ $1$ $0$ $1$

I want to find a set of equations, which has these four assignments as roots (and possibly more). If these assignments are represented in a matrix like this $$\begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 & x_0 \\ 0 & 0 & 1 & 1 & x_1 \\ 1 & 1 & 0 & 0 & y_0 \\ 0 & 1 & 0 & 1 & y_1 \\ 0 & 1 & 0 & 0 & x_0*y_0 \\ 0 & 1 & 0 & 1 & x_0*y_1 \\ 0 & 0 & 0 & 0 & x_1*y_0 \\ 0 & 0 & 0 & 1 & x_1*y_1 \\ \end{pmatrix}$$ it can be brought into (row reduced) echelon form (at least as far as possible - the bold entries are not in echelon form). $$\begin{pmatrix} 1 & 0 & 0 & 0 & x_0*y_0 + x_1 + 1 \\ 0 & 1 & 0 & 0 & x_0*y_0 \\ 0 & 0 & 1 & 0 & x_0*y_0 + x_0 + x_1 \\ 0 & 0 & 0 & 1 & x_0*y_0 + x_0 \\ 0 & 0 & 0 & 0 & \mathbf{x_0 + y_1} \\ 0 & 0 & 0 & 0 & \mathbf{x_1 + y_0 + 1} \\ 0 & 0 & 0 & 0 & \mathbf{x_0*y_1 + x_0} \\ 0 & 0 & 0 & 0 & \mathbf{x_1*y_0} \\ 0 & 0 & 0 & 0 & \mathbf{x_0*y_0 + x_1*y_1 + x_0} \\ \end{pmatrix}$$ The bold last five entries in the last column of this form are, what I'm interested in, as they have roots at the assignments I'm interested in.

Unfortunatelly, it turns out, that these are not the only possible solutions to my problem and with a different ordering of rows, the algorithm can return different sets of equations.

I have implemented this in sagemath:

P.<x0,x1,y0,y1> = PolynomialRing(GF(2), 4)
l1 = [1, 1, 1, 1,     1,
      0, 1, 0, 1,    x0,
      0, 0, 1, 1,    x1,
      1, 1, 0, 0,    y0,
      0, 1, 0, 1,    y1,
      0, 1, 0, 0, x0*y0,
      0, 1, 0, 1, x0*y1,
      0, 0, 0, 0, x1*y0,
      0, 0, 0, 1, x1*y1]
A1 = Matrix(P,9,5,l1)
A1.rank()
# =5

rr_forms1 = set()
last_cols1 = set()
sols1 = set()
for p in itertools.permutations(range(9)):
  E=A1.matrix_from_rows(p).echelon_form()
  if not E in rr_forms1:
    rr_forms1.add(E)
    last_cols1.add(tuple(sorted(E.column(4))))
    sols1.add(tuple(sorted(E.column(4)[4:])))

len(rr_forms1)
# =2520
len(last_cols1)
# =21
len(sols1)
# =4

sols1
# {(x1 + y0 + 1, x0 + y1, x0*y1 + y1, x1*y0, x0*y0 + x1*y1 + y1),
#  (x1 + y0 + 1, x0 + y1, x0*y1 + x0, x1*y0, x0*y0 + x1*y1 + x0),
#  (x1 + y0 + 1, x0*y1 + y1, x0*y1 + x0, x1*y0, x0*y0 + x0*y1 + x1*y1),
#  (x1 + y0 + 1, x1*y0, x0*y0 + x1*y1 + y1, x0*y0 + x1*y1 + x0, x0*y0 + x0*y1 + x1*y1)}

l2 = [1, 1, 1, 1,     1,
      0, 1, 0, 1,    x0,
      0, 0, 1, 1,    x1,
      1, 1, 0, 0,    y0,
      0, 1, 1, 1,    y1,
      0, 1, 0, 0, x0*y0,
      0, 1, 0, 1, x0*y1,
      0, 0, 0, 0, x1*y0,
      0, 0, 1, 1, x1*y1]
A2 = Matrix(P,9,5,l2)
A2.rank()
# =5

rr_forms2 = set()
last_cols2 = set()
sols2 = set()
for p in itertools.permutations(range(9)):
  E=A2.matrix_from_rows(p).echelon_form()
  if not E in rr_forms2:
    rr_forms2.add(E)
    last_cols2.add(tuple(sorted(E.column(4))))
    sols2.add(tuple(sorted(E.column(4)[4:])))

len(rr_forms2)
# =2880
len(last_cols2)
# =24
len(sols2)
# =4

sols2
# {(x1 + y0 + 1, x1*y1 + y0 + 1, x0*y1 + x0, x1*y0, x0*y0 + y0 + y1 + 1),
#  (x1 + y0 + 1, x1*y1 + x1, x0*y1 + x0, x1*y0, x0*y0 + x1 + y1),
#  (x1*y1 + y0 + 1, x1*y1 + x1, x0*y1 + x0, x1*y0, x0*y0 + x1*y1 + y1),
#  (x0*y1 + x0, x1*y0, x0*y0 + y0 + y1 + 1, x0*y0 + x1 + y1, x0*y0 + x1*y1 + y1)}

(the difference between A1 and A2 is just a single bit in the assignments I'm looking for)

As can be seen, there are over 2500 distinct outputs (of about 363k row permutations) this algorithm can produce for the two test matrices but only four possible outcomes (this can also vary, based on the assignments) for the equations I'm interested in.

My question now is:

Is there an easy (and preferrably efficient) way to enumerate the possible sets of equations without iterating over all possible permutations?

It would also be helpfull, to know the number of possible sets in advance.

In addition I was not yet able to figure out, what this specific form of resulting matrix is called or what a common name of this strategy is - so pointers in this direction would also be helpfull.

1

There are 1 best solutions below

0
On BEST ANSWER

While this is not strictly the answer I was originally looking for, this how I tackled my main question of enumerating all possible solutions.

I used a different represenation of the problem and a different solving strategy:

  • Instead of a matrix, I used an ANF boolean formula to describe all possible solutions: $a_1*x_1*y_1 + a_2*x_1*y_0 + a_3*x_0*y_1 + a_4*x_0*y_0 + a_5*y_1 + a_6*y_0 + a_7*x_1 + a_8*x_0 + a_9*1$ (with different assignments for $a_i \in \{0,1\}, i \in \{1..9\}$ it is possible to represent all possible outcomes of the formulas I'm looking for)
  • Now a system of equations is created for all "valid" assignments by setting the expected values from the table of valid assignments and equating them to $0$. For example, the euqation for the second row is $a_1*0*1 + a_2*0*1 + a_3*1*1 + a_4*1*1 + a_5*1 + a_6*1 + a_7*0 + a_8*1 + a_9*1 = 0$ or simplified $a_3 + a_4 + a_5 + a_6 + a_8 + a_9 = 0$
  • This system of equations is then fed into CryptoMiniSat, which supports xor expressions and can iterate over all possible solutions

The resulting CNF file looks like this:

c header: 9 variables and 5 clauses/equations
cnf 9 5
c all clauses have to be terminated by a 0
c XOR lines start with an 'x'. The first variable is negated because we equate to false, not true (which is the default)
x -6 9 0
x -3 4 5 6 8 9 0
x -7 9 0
x -1 3 5 7 8 9 0
c this last line is a normal or of all variables to disallow the trivial all zero solution
1 2 3 4 5 6 7 8 9 0

And the CryptoMiniSat output is:

# cryptominisat5 --verb 0 --maxsol 100 find_root_funcs.cnf 
s SATISFIABLE
v -1 2 -3 -4 -5 -6 -7 -8 -9 0
s SATISFIABLE
v -1 2 3 -4 5 -6 -7 -8 -9 0
s SATISFIABLE
v -1 -2 3 -4 5 -6 -7 -8 -9 0
s SATISFIABLE
v -1 2 -3 -4 5 -6 -7 8 -9 0
s SATISFIABLE
v -1 2 -3 -4 5 6 7 8 9 0
s SATISFIABLE
v -1 2 3 -4 -5 6 7 8 9 0
s SATISFIABLE
v -1 2 -3 -4 -5 6 7 -8 9 0
s SATISFIABLE
v -1 -2 -3 -4 -5 6 7 -8 9 0
s SATISFIABLE
v 1 -2 -3 4 5 6 7 -8 9 0
s SATISFIABLE
v -1 -2 3 -4 5 6 7 -8 9 0
s SATISFIABLE
v -1 2 3 -4 5 6 7 -8 9 0
s SATISFIABLE
v 1 2 -3 4 5 6 7 -8 9 0
s SATISFIABLE
v 1 2 3 4 -5 6 7 -8 9 0
s SATISFIABLE
v 1 -2 3 4 -5 6 7 -8 9 0
s SATISFIABLE
v 1 -2 3 4 -5 -6 -7 -8 -9 0
s SATISFIABLE
v 1 2 3 4 -5 -6 -7 -8 -9 0
s SATISFIABLE
v 1 2 -3 4 5 -6 -7 -8 -9 0
s SATISFIABLE
v 1 -2 -3 4 5 -6 -7 -8 -9 0
s SATISFIABLE
v 1 -2 3 4 5 6 7 8 9 0
s SATISFIABLE
v 1 2 3 4 5 6 7 8 9 0
s SATISFIABLE
v 1 2 -3 4 -5 6 7 8 9 0
s SATISFIABLE
v 1 -2 -3 4 -5 6 7 8 9 0
s SATISFIABLE
v -1 -2 3 -4 -5 6 7 8 9 0
s SATISFIABLE
v -1 -2 -3 -4 5 6 7 8 9 0
s SATISFIABLE
v -1 -2 -3 -4 5 -6 -7 8 -9 0
s SATISFIABLE
v -1 -2 3 -4 -5 -6 -7 8 -9 0
s SATISFIABLE
v 1 -2 3 4 5 -6 -7 8 -9 0
s SATISFIABLE
v 1 2 3 4 5 -6 -7 8 -9 0
s SATISFIABLE
v 1 2 -3 4 -5 -6 -7 8 -9 0
s SATISFIABLE
v 1 -2 -3 4 -5 -6 -7 8 -9 0
s SATISFIABLE
v -1 2 3 -4 -5 -6 -7 8 -9 0
s UNSATISFIABLE

These are all possible solutions (functions), which satisfy the condition that they have a root (equate to $0$) at all four valid assignments.

The first result line v -1 2 -3 -4 -5 -6 -7 -8 -9 0 is read as $a_1=0; a_2=1; a_3=0; a_4=0; a_5=0; a_6=0; a_7=0; a_8=0; a_9=0$, which is the function $x_1*y_0$ (which was also in the result set from the matrix method).

The s UNSATISFIABLE in the last line indicates, that all possible solutions have been exhausted.

Note that this method also found a lot more equations, than the matrix reduction and was much faster!