Intro
I'm implementing a variant called PCG [1] of a Linear-Congruential Generator (LCG), which is an extremely simple pseudo-random number generator (PRNG) governed by the equation
$$S_{n+1} = aS_{n} + c \pmod{m}$$
where $S_i$ is the state $i$ calls of next() after 0, and $a, c$ and $m$ are parameters that together determine the quality of the LCG. Full-period (order-$m$) LCGs occur iff:
- $\gcd(m,c)=1$
- $a-1$ divisible by all prime factors of $m$
- $a-1$ divisible by 4 if $m$ divisible by 4
See [2] for more details about LCGs.
Distance
The author of PCG claims that PCG offers the useful features of "jump-ahead" and "distance" [3]. I suspect that jump-ahead refers to the ability to skip "efficiently" $n$ calls to next() (where "efficiently" usually means $O(\log n)$ or $O(\log m)$ time); This is borne out by the fact that PCG, LCG and ChaCha20 (counter mode) are listed as having that property, but the RC4-based Arc4Random doesn't and therefore isn't listed as having that property. Skip-ahead in LCGs can be easily done by exploiting recursively the equation
$$S_{n+2} = a(aS_{n} + c) + c = \underbrace{a^2}_\textrm{new $a$} S_{n} + \underbrace{(a+1)c}_\textrm{new $c$} \pmod{m}$$
in a binary-exponentiation-type loop. That leaves, however, the property of "distance". I strongly suspect that this refers to the ability to compute "efficiently", given the current states $S_i$ and $S_j$ of two generators, the distance $j-i$, which is the number of calls to next() that bring a state $S_i$ to a state $S_j$ (again, "efficiently" usually means here $O(\log (j-i))$ or $O(\log m)$ time). The author claims PCG, LCG and counter-mode ciphers like ChaCha20 have it, but Xorshift and the rest do not.
My question lies here.
How do we compute the distance between two LCG states?
For counter-mode cipher PRNGs, the distance between two states is a simple subtraction of the numerical value of the states, since advancing a state is just an increment by 1 of the state variable. All the complexity of the PRNG is in the output function.
But how is it done for LCGs? As far as I understand, computing the difference between two states can be reduced to the problem of computing the distance $d(S_j, S_i)$ between the two states and the zero-state $S_0 = 0$:
$$d(S_j, S_i) = d(S_j, 0) - d(S_i, 0) = j-i$$
We know that the closed-form solution for skip-ahead by $k$ steps in LCGs is
$$S_{n+k} = a^k S_n + \frac{a^{k-1}}{a-1}c \pmod{m}$$
, and when $S_0 = 0$ this simplifies to
$$S_{k} = \frac{a^{k-1}}{a-1}c \pmod{m}$$
Suppose that we now have some state $S_n$ of unknown $n$ and we wish to determine $n$. After some algebra, we get
$$ \begin{align*} \frac{a-1}{c}S_{n} &= a^{n-1} \pmod{m} \\ C &= a^E \pmod{m} \end{align*} $$
which is nothing but the discrete logarithm problem, which remains famously unsolved! So how does one, in general, compute the distance $d(S_j, S_i)$ faster than brute force?
And yet...
I've found a $O(\log n)$ solution that appears to work at least in the special case of PCG, which uses $m=2^{32}$ or $2^{64}$, $a = 1 \pmod{4}$, odd $c$ and is full-period.
It's based on the fact that PCG always alternates between odd and even LCG states.
- If $S_i$ is even but $S_j$ is odd or vice-versa, then the lowest bit of $i-j$ is 1, otherwise it's 0. If it is 1, I advance the state $S_i$ by one, otherwise I leave it untouched. I then set
c = (a+1)*canda = a^2. - $S_i$ and $S_j$ are now congruent modulo 2. I examine now the second rightmost bit in both $S_i$ and $S_j$: If they mismatch, I advance the state $S_i$ using the current $a$ and $c$, otherwise I leave it untouched. I then again set
c = (a+1)*canda = a^2. - $S_i$ and $S_j$ are now congruent modulo 4. I examine now the third rightmost bit in both $S_i$ and $S_j$: If they mismatch, I advance the state $S_i$ using the current $a$ and $c$, otherwise I leave it untouched. I then again set
c = (a+1)*canda = a^2. - ... Repeat until $S_i = S_j$.
This ad-hoc algorithm appears to work for every case I've thrown at it so far. But why, when the discrete logarithm has such a reputation for difficulty?
uint64_t lcg64Diff(const LCG64* Ss, const LCG64* Se){
uint64_t a = LCG64_a,
c = LCG64_c,
p = 1,
Z = Ss->S,
D = 0;
while(Z != Se->S){
if((Z^Se->S) & p){
Z = a*Z + c;
D += p;
}
c *= a+1;
a *= a;
p <<= 1;
}
return D;
}
Very nice writeup! I agree your algorithm is correct for a full-period LCG with a power-of-2 modulus.
I just want to add that the difficulty of the general discrete logarithm problem shouldn't cause real doubt. That problem is also much easier to solve if the group order is "smooth" (has only small prime factors), and especially easy if the order is the power of a single small prime. For example, see the Pohlig-Hellman algorithm
However, because we're working modulo a power of 2, the connection to discrete logs is weak: we're not working with a cyclic group (there is no generator). For example, picture a generator with $a = 5$ and $m = 8$.
All integer powers of 5 are congruent to either 1 or 5 modulo 8. The chain of implications in your reasoning becomes one-way at the point you multiply both sides by $a-1$. Since $a$ is congruent to 1 modulo 4, $gcd(a-1,m)$ is at least 4 in general, so $a-1$ has no multiplicative inverse modulo $m$. There isn't a unique answer to the derived discrete log problems.
By the way, there's a repeated typo in your closed-form expressions: $a^{k-1}$ in the numerator should be $a^k-1$ instead.
So repair that, and work out the details for the $a=5$ and $m=8$ example exhaustively (and, say, pick $c=3$ - any odd integer will do). That will show you where and how the analogy with discrete logs gets distorted.
Fleshing it out
For $a=5$, $c=3$, $m=8$,
$$S_{n} = \frac{a^n-1}{a-1}c \pmod{m}$$
becomes
$$S_{n} = \frac{5^n-1}{4}3 \pmod{8} [ORIGINAL]$$
3 is its own multiplicative inverse mod 8, so multiplying both sides by 3 gives (and this step is invertible):
$$3S_{n} = \frac{5^n-1}{4} \pmod{8}$$
However, multiplying by 4 is not invertible, because (as explained earlier) the even $a-1$ has no multiplicative inverse modulo any power-of-2 $m$:
$$12S_{n} = 4S_{n} = 5^n-1 \pmod{8}$$
Finally, adding 1,
$$4S_{n}+1 = 5^n \pmod{8}[FINAL]$$
And there's the rub. While there are 8 possible interesting values for $S_{n}$, the left hand side always reduces to 1 or 5 modulo 8, matching that all powers of 5 are congruent to 1 or 5 modulo 8. So while $[FINAL]$ holds true for all $S_{n}$, it's not actually of much use for discovering the $n$ needed to make $[ORIGINAL]$ true.