How to solve for a variable in an equation that involves XOR?

1.3k Views Asked by At

I was recently introduced to XOR and other bitwise operators while reading up some articles on C++. The concept seems rather simple but can be confusing because it involves visualizing numbers in binary. I ran into an equation that involves XOR sometime while I was looking up some real-world examples people use for encryption.

I was baffled at the equation, and wanted to learn how to solve it.

Here is the equation:

$y = (x * (i+A)+B) \oplus (x*i+C)$

I am trying to solve for x in the equation, with all other variables being known at the time of the solve. Variables A, B, and C remain constant at all times, with y, x, and i changing. This can simplify the equation to:

$y = (x * (i+32757935)-29408451) \oplus (x*i-5512095)$

I would like to learn how to solve this equation for x, and many others similar to this in the future as well. As such, I want to have some pointers on how to solve it. I remember reading up that splitting the equation into a system of linear equations in modulo 2 is the way to go, but I do not know how to do that. I've read up about the rules about XOR and such, but I'm not sure how I should go about creating a system, and solving it.

Note:

  • All variables are unsigned 32-bit integers
  • $\oplus$ is the XOR operation
2

There are 2 best solutions below

0
On

We know that

  1. $A\oplus B=B\oplus A$
  2. $A\oplus(B\oplus C)=(A\oplus B)\oplus C$
  3. $A\oplus A=0$
  4. $A\oplus 0=A$

However, what are lacking are rules for dealing with

  1. $A\oplus(B+C)$
  2. $A\oplus(B*C)$

We can manipulate the equation as follows:

\begin{eqnarray} y &=& (x * (i+A)+B) \oplus (x*i+C)\\ y\oplus (x*i+C) &=& x * (i+A)+B\\ y\oplus (x*i+C)-B &=& x * (i+A)\\ x&=&\frac{y\oplus (x*i+C)-B}{i+A} \end{eqnarray}

This opens the possibility of using a recursion which hopefully converges to a solution for $x$:

\begin{equation} x_{n+1}=\frac{y\oplus (x_n*i+C)-B}{i+A} \end{equation}

4
On

For reasons that @JohnWaylandBales pointed out, you're unlikely to find a closed form for $x$ in terms of your other variables. The recursion approach suggested in that answer is interesting, but it's also not clear why such an approach should work (and I suspect in many cases, it will not; for instance, probably for most values of $x_n$, the RHS will not be an integer).

We could still hope for a simple and efficient method/algorithm to solve for $x$ given the values of the other variables. This is slightly confounded by the issue that there might be multiple (or even infinitely many) values of $x$ which satisfy your equation. Even worse, even if we restrict looking at $x$ of a certain size (e.g. 32-bit $x$), there still may be exponentially many possible $x$ (a fun example to think about is the equation $x \oplus 2x = 3x$).

I don't know of any provably efficient method to find such an $x$, nor do I know any result that says it is hard (perhaps there is some way to embed an NP-complete problem into equations of this form). That said, here is a heuristic approach which will definitely find all $B$-bit solutions $x$ to your equation, and possibly spend much less time than the brute force solution.

The trick is to notice that both operators $+$ and $\oplus$ satisfy the following property for any positive integer $k$:

$$((a \bmod 2^{k}) + (b \bmod 2^{k})) \bmod 2^{k} = (a+b) \bmod 2^{k}$$

$$((a \bmod 2^{k}) \oplus (b \bmod 2^{k})) = (a\oplus b) \bmod 2^{k}$$

This lets us build up a solution to your equation, starting modulo 2, then progressing modulo 4, modulo 8, until finally we have solutions modulo $2^{B}$ (which correspond to $B$-bit solutions). For example, let's look at the equation

$$(2x+5)\oplus(3x+4) = 56 $$

We'll try to find all 6-bit solutions $x$ to this equation (i.e. all $x \in [0, 63]$ that satisfy this equation).

We begin by looking modulo 2. Modulo 2 $\oplus$ is just $+$, so our equation reduces to

$$(2x + 5) + (3x + 4) \equiv 62 \bmod 2$$

or equivalently

$$x + 1 \equiv 0 \bmod 2$$

It follows from this that any solution $x$ must be odd, i.e. $x \equiv 1 \bmod 2$. Proceeding modulo $4$, we have that $x$ must satisfy (modulo 4):

$$(2x + 1) \oplus 3x = 2$$

We know $x$ is odd, so there are two values to try mod 4, namely $1$ and $3$. We find that $1$ does not work ($3 \oplus 3 = 0$) but $3$ does ($7 \oplus 9 = 14 \equiv 2 \bmod 4$). So now we know that $x$ is $3$ mod $4$.

Trying both $x=3$ and $x=7$, we find that $x$ must be $3$ mod $8$. Continuing, we find that $x$ must be $11$ mod $16$, $11$ mod $32$, and $11$ mod $64$; so the only possible $x$ in our range that can work is $x=11$, and we can check that it does in fact work.

This procedure seems to work quite well in most cases; at each step you usually only have to check $2$ possible values modulo $2^k$. Of course, there are cases where both solutions mod $2^{k}$ will work, and this can lead to exponential branching. For example, $x \oplus x = 0$ inevitably leads to exponential branching, since every $x$ is a solution. More subtly, things like $(64x + 1) \oplus (192x + 2) = 3$ will also lead to a lot of branching even though the only solution is $x=0$, because this equation reduces to $1 \oplus 2 = 3$ modulo $2^k$ for $k$ up to $6$. I think it might be possible to prove some general characterization of when it does work, in terms of powers of $2$ dividing $i$, $A$, and $i+A$, but I haven't been able to work it out yet.

Hopefully this helps!