I'm currently experimenting with finding systems of multivariate equations, matching a given set of inputs with their roots.
I'll explain with a toy example:
Given four "valid" assignments of two inputs $x_0$ and $x_1$ and two outputs $y_0$ and $y_1$
| $x_0$ | $x_1$ | $y_0$ | $y_1$ |
|---|---|---|---|
| $0$ | $0$ | $1$ | $0$ |
| $1$ | $0$ | $1$ | $1$ |
| $0$ | $1$ | $0$ | $0$ |
| $1$ | $1$ | $0$ | $1$ |
I want to find a set of equations, which has these four assignments as roots (and possibly more). If these assignments are represented in a matrix like this $$\begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 & x_0 \\ 0 & 0 & 1 & 1 & x_1 \\ 1 & 1 & 0 & 0 & y_0 \\ 0 & 1 & 0 & 1 & y_1 \\ 0 & 1 & 0 & 0 & x_0*y_0 \\ 0 & 1 & 0 & 1 & x_0*y_1 \\ 0 & 0 & 0 & 0 & x_1*y_0 \\ 0 & 0 & 0 & 1 & x_1*y_1 \\ \end{pmatrix}$$ it can be brought into (row reduced) echelon form (at least as far as possible - the bold entries are not in echelon form). $$\begin{pmatrix} 1 & 0 & 0 & 0 & x_0*y_0 + x_1 + 1 \\ 0 & 1 & 0 & 0 & x_0*y_0 \\ 0 & 0 & 1 & 0 & x_0*y_0 + x_0 + x_1 \\ 0 & 0 & 0 & 1 & x_0*y_0 + x_0 \\ 0 & 0 & 0 & 0 & \mathbf{x_0 + y_1} \\ 0 & 0 & 0 & 0 & \mathbf{x_1 + y_0 + 1} \\ 0 & 0 & 0 & 0 & \mathbf{x_0*y_1 + x_0} \\ 0 & 0 & 0 & 0 & \mathbf{x_1*y_0} \\ 0 & 0 & 0 & 0 & \mathbf{x_0*y_0 + x_1*y_1 + x_0} \\ \end{pmatrix}$$ The bold last five entries in the last column of this form are, what I'm interested in, as they have roots at the assignments I'm interested in.
Unfortunatelly, it turns out, that these are not the only possible solutions to my problem and with a different ordering of rows, the algorithm can return different sets of equations.
I have implemented this in sagemath:
P.<x0,x1,y0,y1> = PolynomialRing(GF(2), 4)
l1 = [1, 1, 1, 1, 1,
0, 1, 0, 1, x0,
0, 0, 1, 1, x1,
1, 1, 0, 0, y0,
0, 1, 0, 1, y1,
0, 1, 0, 0, x0*y0,
0, 1, 0, 1, x0*y1,
0, 0, 0, 0, x1*y0,
0, 0, 0, 1, x1*y1]
A1 = Matrix(P,9,5,l1)
A1.rank()
# =5
rr_forms1 = set()
last_cols1 = set()
sols1 = set()
for p in itertools.permutations(range(9)):
E=A1.matrix_from_rows(p).echelon_form()
if not E in rr_forms1:
rr_forms1.add(E)
last_cols1.add(tuple(sorted(E.column(4))))
sols1.add(tuple(sorted(E.column(4)[4:])))
len(rr_forms1)
# =2520
len(last_cols1)
# =21
len(sols1)
# =4
sols1
# {(x1 + y0 + 1, x0 + y1, x0*y1 + y1, x1*y0, x0*y0 + x1*y1 + y1),
# (x1 + y0 + 1, x0 + y1, x0*y1 + x0, x1*y0, x0*y0 + x1*y1 + x0),
# (x1 + y0 + 1, x0*y1 + y1, x0*y1 + x0, x1*y0, x0*y0 + x0*y1 + x1*y1),
# (x1 + y0 + 1, x1*y0, x0*y0 + x1*y1 + y1, x0*y0 + x1*y1 + x0, x0*y0 + x0*y1 + x1*y1)}
l2 = [1, 1, 1, 1, 1,
0, 1, 0, 1, x0,
0, 0, 1, 1, x1,
1, 1, 0, 0, y0,
0, 1, 1, 1, y1,
0, 1, 0, 0, x0*y0,
0, 1, 0, 1, x0*y1,
0, 0, 0, 0, x1*y0,
0, 0, 1, 1, x1*y1]
A2 = Matrix(P,9,5,l2)
A2.rank()
# =5
rr_forms2 = set()
last_cols2 = set()
sols2 = set()
for p in itertools.permutations(range(9)):
E=A2.matrix_from_rows(p).echelon_form()
if not E in rr_forms2:
rr_forms2.add(E)
last_cols2.add(tuple(sorted(E.column(4))))
sols2.add(tuple(sorted(E.column(4)[4:])))
len(rr_forms2)
# =2880
len(last_cols2)
# =24
len(sols2)
# =4
sols2
# {(x1 + y0 + 1, x1*y1 + y0 + 1, x0*y1 + x0, x1*y0, x0*y0 + y0 + y1 + 1),
# (x1 + y0 + 1, x1*y1 + x1, x0*y1 + x0, x1*y0, x0*y0 + x1 + y1),
# (x1*y1 + y0 + 1, x1*y1 + x1, x0*y1 + x0, x1*y0, x0*y0 + x1*y1 + y1),
# (x0*y1 + x0, x1*y0, x0*y0 + y0 + y1 + 1, x0*y0 + x1 + y1, x0*y0 + x1*y1 + y1)}
(the difference between A1 and A2 is just a single bit in the assignments I'm looking for)
As can be seen, there are over 2500 distinct outputs (of about 363k row permutations) this algorithm can produce for the two test matrices but only four possible outcomes (this can also vary, based on the assignments) for the equations I'm interested in.
My question now is:
Is there an easy (and preferrably efficient) way to enumerate the possible sets of equations without iterating over all possible permutations?
It would also be helpfull, to know the number of possible sets in advance.
In addition I was not yet able to figure out, what this specific form of resulting matrix is called or what a common name of this strategy is - so pointers in this direction would also be helpfull.
While this is not strictly the answer I was originally looking for, this how I tackled my main question of enumerating all possible solutions.
I used a different represenation of the problem and a different solving strategy:
The resulting CNF file looks like this:
And the CryptoMiniSat output is:
These are all possible solutions (functions), which satisfy the condition that they have a root (equate to $0$) at all four valid assignments.
The first result line
v -1 2 -3 -4 -5 -6 -7 -8 -9 0is read as $a_1=0; a_2=1; a_3=0; a_4=0; a_5=0; a_6=0; a_7=0; a_8=0; a_9=0$, which is the function $x_1*y_0$ (which was also in the result set from the matrix method).The
s UNSATISFIABLEin the last line indicates, that all possible solutions have been exhausted.Note that this method also found a lot more equations, than the matrix reduction and was much faster!