This question is a follow-up to this one. I began an investigation into whether the polynomial in Pollard's rho is actually necessary.
Apparently any source of randomness can be used, so I decided to see what happens if I take the polynomial out. So I wrote the procedure rho_random below. (It is Pollard's rho driven by Python's random module instead of a polynomial. Other than that, it is just like the second procedure here, which is the implementation you can find in Menezes' Handbook of Applied Cryptography.)
First the results.
    >>> rho_random(100001220001957)
    {'chances': 13440909, 'factor': 10000103, 'divisions': 381526100}
    >>> menezes_counted(100001220001957)
    {'factor': 10000103, 'divisions': 93512}
The implementation with the polynomial $x^2 + 1$ is much faster. The question is why. (Notice my intention is to make both procedures different only by the source of randomness.)
    from math import sqrt
    import random

    def rho_random(N):
        chances = 0
        chances_max = 2 * int(sqrt(N))
        divisions = 0
        random.seed(1)
        def random_mod_N():
            return random.randint(1, N - 1)
        while True:
            if chances_max <= chances:
                return "unlucky"
            a = random_mod_N()
            b = random_mod_N()
            d = my_gcd(abs(a - b), N)
            chances = chances + 1
            divisions = divisions + d["divisions"]
            if d["gcd"] == N:
                return "trivial factor"
            if 1 < d["gcd"] < N:
                return {"factor": d["gcd"], "divisions": divisions, "chances": chances}
    def menezes_counted(N):
        divisions = 0
        def f(x):
            return (x**2 + 1) % N
        a = 2
        b = 2
        while True:
            a = f(a)
            b = f(f(b))
            d = my_gcd(abs(a - b), N)
            divisions = divisions + d["divisions"]
            if d["gcd"] == N:
                return "trivial factor"
            if 1 < d["gcd"] < N:
                return {"factor": d["gcd"], "divisions": divisions}
I wrote my own gcd so I could count divisions.
    def my_gcd(a, b):
        def xgcd(a, b, t):
            if b == 0:
                return {"gcd": a, "divisions": t}
            return xgcd(b, a % b, t + 1)
        return xgcd(a, b, 0)
EDIT 1. One thing I notice which is different between rho_random and menezes_counted is that menezes_counted uses Floyd's algorithm, while rho_random compares only adjacent random numbers. I don't see a problem with that: both sources of randomness are assumed to be random, so any two numbers at all are equally likely to give us a collision. Whether I pick adjacent numbers or numbers far apart, my chance should be the same.
There are a couple of things to talk about here. Throughout, $N$ is a composite number to be factored and $p$ is its smallest prime factor.
Pollard's method
In A Monte Carlo Method for Factorization, Pollard describes a method of factorization using iterates of a polynomial to accumulate factors into a highly composite $Q$ that, it is hoped, will eventually share a factor with $N$. \begin{align*} x_0 &= 2 & Q_0 &= 1 \\ x_{i} &= x_{i-1}^2-1 \pmod{N} & Q_{i} &= Q_{i-1} (x_{2i} - x_i) \pmod{N} \\ d_i &= \gcd(Q_i,N) \end{align*} Pollard tracks triples $(x_i, x_{2i}, Q_i)$, since from one of these the next can be obtained by one multiply for the first entry, two multiplies for the second entry, and one multiply for the third entry, followed by a $\gcd$ computation. This is slightly slower than five multiplies because $\gcd$ computations are a factor of about $\log n$ slower than multiplies of length-$n$ numbers.
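As a concrete (if unoptimized) sketch of these recurrences, here is my own reading of them in code, not Pollard's program; the iteration budget is an arbitrary safety cap:

```python
from math import gcd

def pollard_rho_Q(N, max_iter=10**6):
    # Iterate the triples (x_i, x_{2i}, Q_i) from the recurrences above:
    # one squaring for x_i, two for x_{2i}, one multiply to accumulate Q_i.
    f = lambda t: (t * t - 1) % N
    x, y, Q = 2, 2, 1
    for i in range(1, max_iter):
        x = f(x)               # x_i
        y = f(f(y))            # x_{2i}
        Q = Q * (y - x) % N    # Q_i = Q_{i-1} * (x_{2i} - x_i) mod N
        d = gcd(Q, N)          # d_i
        if 1 < d < N:
            return d           # nontrivial factor found
        if d == N:
            return None        # Q vanished mod N; one would backtrack here
    return None
```

For example, pollard_rho_Q(91) returns the factor 13 on the third iteration.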
Pollard goes on to observe that $x_n$ is eventually periodic (it can take at most $N$ values, but even better, we can consider it modulo $p$ and only have to find a collision among $p$ possible values). He writes $c$ for the period of the $x_n$ and $t$ for the length of the (possibly empty) pre-periodic part. (This shape of a straight segment merging into a cycle is the source of the name "$\rho$ method".) Since there is a cycle, there is an $r$ such that $x_{2r} \equiv x_r \pmod{p}$, which is easy to find using the triples above. It turns out that $r$ is a multiple of the period $c$, the smallest one not less than $t$. (Detecting a cycle by testing $x_{2r} = x_r$ is Floyd's cycle-finding method.)
Since we are factoring, we don't actually know $p$, so we can think of $r$ as a function of $p$: supposing $p$ is the smallest prime factor of $N$, what value of $r = r(p)$ gives us $x_{2r} \equiv x_r \pmod{p}$? This is useful because for $i \geq r(p)$, $Q_i \equiv 0 \pmod{p}$, and we find the factor $p$ when we compute the $\gcd$ for $d_i$. (We might actually have several primes with the same $r(p)$, all of which divide $N$, in which case we get their product in the $\gcd$. So we should verify that $d_i$ is prime, and otherwise continue by factoring $d_i$ and then $N/d_i$.)
If we knew $c(p)$ and $t(p)$ for our choice of polynomial and starting value and for every prime dividing $N$, we would know that $d_{c(p) + t(p)}$ contains a nontrivial divisor of $N$. However, we do not know which $p$ divide $N$, so we iterate triples until $i=r$ to find a collision.
Pollard gathers statistics on the expected values of $c(p)$, $t(p)$, and $r(p)$ for various choices of $p$. Under the random function assumption, he estimates these to be \begin{align*} \mathbb{E}(c(p)) = \mathbb{E}(t(p)) &= \sqrt{\pi p /8} = 0.6267\dots \sqrt{p} \\ \mathbb{E}(r(p)) &= \frac{\pi^{5/2}}{12 \sqrt{2}}\sqrt{p} = 1.0308\dots \sqrt{p} \text{.} \end{align*} Then he presents data for the 100 largest primes less than $10^6$. For these, the means of $c(p)/\sqrt{p}$, $t(p)/\sqrt{p}$, and $r(p)/\sqrt{p}$ were $0.6127$, $0.6821$, and $1.0780$ respectively, certainly in the neighbourhood of the expectations listed above. He also presents data about the dispersion of the values of $r(p)$.
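These quantities are easy to measure directly. Here is a small sketch (my own, assuming Pollard's $f(x) = x^2 - 1$ and $x_0 = 2$ from above) that computes $t(p)$, $c(p)$, and the Floyd meeting index $r(p)$ for a single prime $p$:

```python
def rho_stats(p, x0=2):
    # Tail length t, cycle length c, and Floyd meeting index r for
    # the sequence x_i = x_{i-1}^2 - 1 mod p.
    f = lambda x: (x * x - 1) % p
    seen = {}                      # value -> index of first occurrence
    x, i = x0, 0
    while x not in seen:
        seen[x] = i
        x = f(x)
        i += 1
    t = seen[x]                    # the first repeated value enters the cycle here
    c = i - t                      # period of the sequence
    # Smallest r >= 1 with x_{2r} = x_r: Floyd's meeting point.
    a, b, r = f(x0), f(f(x0)), 1
    while a != b:
        a, b = f(a), f(f(b))
        r += 1
    return {"t": t, "c": c, "r": r}
```

Averaging $t$, $c$, and $r$ (each divided by $\sqrt{p}$) over many primes is how the figures above can be reproduced. For instance, $p = 13$ gives $t = 1$, $c = 3$, $r = 3$, and indeed $r$ is a multiple of $c$.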
Notice the goal is to estimate how big $i$ must be to make $p$ divide $Q_i$.
Brent's Improvement
In An Improved Monte Carlo Factorization Algorithm, Brent presents a different method for computing the period of $x_n$. In section 4, he computes that his cycle finder completes after $64.12\dots\%$ as many iterations as Floyd's, with about half the variance.
Pollard detects a cycle by having a nontrivial $\gcd$ appear in $d_i$. Brent replaces his own termination criterion, that the $j^\text{th}$ and $k^\text{th}$ iterates are equal, $x = y$, with the computation of $d_i$ (since a nontrivial $\gcd$ can occur much earlier than equality of the iterates modulo $N$). He observes that one need not compute all of the $d_i$: one can avoid many $\gcd$s by accumulating the $Q_i$ and only computing $d_i$ when $i$ is a multiple of some number $m$ (but not too large an $m$, otherwise we risk having $Q_i$ be a multiple of $N$). If $Q_i \equiv 0 \pmod{N}$, then backtrack to the previous $x, y$ and compute $\gcd$s for each iterate. Brent finds that this only speeds up the computation by about 4%.
Brent then explains that a noticeable speedup occurs if we do not accumulate $x_r - x_k$ when $k$ is small compared to $r$. The idea is that the exciting factors in $Q_i$ appear once the $x_n$ start cycling: if $x_k$ is still in the pre-periodic tail of the $\rho$, $Q_i$ isn't picking up multiples of $p$ yet. Making this change, the expected number of iterations of the polynomial drops from $3.9655\dots \sqrt{p}$ to $3.1225\dots \sqrt{p}$, a reduction of about 21%.
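Combining the two ideas (batched $\gcd$s with backtracking, and skipping comparisons while the saved checkpoint may still be in the tail) gives the usual form of Brent's method. The following is my own sketch, not Brent's program; the starting value, constant, and batch size are arbitrary illustrative choices:

```python
from math import gcd

def brent_rho(N, x0=2, c=3, m=32):
    # Brent's variant: y chases checkpoints x = x_{2^n - 1}; the first r
    # iterates after a checkpoint are skipped (the "tail" refinement), then
    # differences x - y are batched m at a time into q before each gcd.
    f = lambda t: (t * t + c) % N
    y, r, q, G = x0, 1, 1, 1
    while G == 1:
        x = y                        # checkpoint
        for _ in range(r):
            y = f(y)                 # advance past the checkpoint, no comparing
        k = 0
        while k < r and G == 1:
            ys = y                   # save y in case the batch overshoots
            for _ in range(min(m, r - k)):
                y = f(y)
                q = q * abs(x - y) % N
            G = gcd(q, N)
            k += m
        r *= 2
    if G == N:                       # batch collected too much: redo singly
        G = 1
        while G == 1:
            ys = f(ys)
            G = gcd(abs(x - ys), N)
    return G                         # a nontrivial factor, or N on failure
```

With these (arbitrary) choices, brent_rho(91) returns the factor 7 after two checkpoint rounds.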
In section 8, Brent collects statistics on the number of multiplications required to find $p$ using his cycle finder, for various $p < 10^8$. Note that all that is needed for this is $f(x) = x^2 + 3$, $x_0 = 0$, $p$, and $m = 1$; the last means every $\gcd$ is calculated, so the run stops as soon as $d_i$ detects $p$. There is no $N$ in this calculation. He finds that the mean and maximum numbers of multiplies for this range of $p$ and these choices of $f$ and $x_0$ are noticeably less than with Floyd's cycle detection and are quite close to the estimates from sections 6 and 7.
Surveys over $c$ and $x_0$
Bach, in Toward a Theory of Pollard's Rho Method, studies varying the $c$ in $f(x) = x^2 + c$ and varying the $x_0$. He shows, without heuristic assumptions, that the expected number of steps to find the factor $p$ scales as $\sqrt{p}$. (In fact, he finds slightly sharper estimates.)
Discussion
Your calculation is not apples-to-apples with the statistics collected by Pollard and Brent. Your `rho_random` selects two residues, `a` and `b`, modulo $N$ uniformly, then hopes to find a nontrivial factor of $N$ via $\gcd(|a-b|,N)>1$. The probability that, for $p \mid N$, $p \mid |a-b|$ is $1/p$, so the probability of a useless pair `a`, `b` is $$ U(N) = \prod_{p \mid N} \frac{p-1}{p} $$ and the probability of $k$ useless pairs in a row is $U(N)^k$. We expect a 50-50 chance of having found a useful pair once $k$ is large enough that $U(N)^k < \frac{1}{2}$. (This depends on the assumption made throughout this subject: `a` and `b` really are uniformly distributed, just like $f$ is a random function.) For $N \in [1,10^5]$, I find $U(N)$ averages $0.607929\dots$, so we have a $p$ once $k \approx 1.39271\dots$. For $N \in [10^5, 10^6]$, I see the same summary statistics for $U(N)$.

What's going on? For $p = 2$, $\gcd(|a-b|,N)$ finds $p$ (as long as $a \neq b$) if $a$ and $b$ are both even or both odd, so half the time. One-third of the time we find $p=3$, one-fifth of the time $p=5$, and so on. In other words, in searching through intervals, the largest effect is that half of the $N$ are divisible by $2$ and we find that $p$ quickly, one-third are divisible by $3$ and again we find that $p$ quickly. The slow cases come from semiprimes, $N = p q$, where $p$ and $q$ are about the same size. In fact, $U(101 \times 103) = 0.980486\dots$ and we expect to have to generate $35$ pairs before we have a 50-50 chance of having found either prime factor. For $N = p q$, $U(N) = (1 - \frac{1}{p})(1 - \frac{1}{q})$, which is dominated by $1 - \frac{1}{p}$. Thus, we learn about the speed of `rho_random` by studying how many pairs are required to find $p$, the smallest prime factor of $N$, not by surveying factorizations in ranges, which are dominated by the numbers with $p = 2$.

Stealing from Brent, we expect `menezes_counted` to require $4.1232\dots \sqrt{p}$ multiplies to arrive at $i = r(p)$. For $p = 2$, this is $5.83109\dots$ to discover $x_2 \equiv x_4$. For $p = 101$, the other example above, we expect $41.4376\dots$. For this many pairs in `rho_random`, $U(101 \times 103)^{41.4376} = 0.441936\dots$, so we only have a 56% chance of factoring this $N$ with so few pairs.

Note that we don't have to do either survey using values of $N$. For `rho_random`, generate random pairs and keep track of the first few dozen primes dividing the differences they produce. For each such prime, keep track of how many pairs until its first occurrence, or how many pairs since its previous occurrence. This lets you build up statistics for how many pairs you have to generate, on average, before a pair is divisible by $p$. The theoretical expectation is about $p$: a uniformly distributed `b` lands in the same congruence class modulo $p$ as `a` with probability $1/p$, independently on each try, so the number of tries is geometric with mean $p$; there is no birthday effect, because each pair is fresh. For each prime $p$ you find this way, you can run `menezes_counted(p)`, replacing the line "d = my_gcd(abs(a - b), N)" with the test "abs(a - b) % p == 0". That is, run `menezes_counted` until it finds the prime $p$. This should take $r(p)$ `a - b` pairs, which, as explained above, should be about $\sqrt{p}$.

If we are to do an apples-to-apples comparison, we should track the expensive steps in both algorithms: multiplications and $\gcd$s. And since we track multiplications in $f$, we should track multiplications in `random`. Python's `random` uses the Mersenne Twister. So `rho_random` generating $a$ and $b$ uses two "multiplies" and computing their $\gcd$ with $N$ uses one more, while `menezes_counted` generating `a` and `b` uses three "multiplies" and the $\gcd$ uses one more. Counting the individual divisions in the $\gcd$ isn't quite right; those steps could be done efficiently with a binary-shifting $\gcd$, so each should be about as fast as an addition, not as slow as a multiplication.
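The claimed average of $U(N)$ is easy to check. Note that $U(N) = \varphi(N)/N$, whose average over $[1,x]$ tends to $6/\pi^2 = 0.60792\dots$, matching the figure above. A quick sieve (my own sketch):

```python
def average_U(limit):
    # U[n] = prod over distinct primes p | n of (1 - 1/p) = phi(n)/n,
    # filled in by a sieve; U[n] is still exactly 1.0 iff n is 1 or prime.
    U = [1.0] * (limit + 1)
    for p in range(2, limit + 1):
        if U[p] == 1.0:                      # no smaller prime divided p
            for multiple in range(p, limit + 1, p):
                U[multiple] *= 1 - 1 / p
    return sum(U[1:]) / limit
```

Running average_U(10**5) comes out near $0.6079$, and $(1 - \frac{1}{101})(1 - \frac{1}{103}) = 0.980486\dots$, the value used for the semiprime example above.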