Learning basic number theory, I wrote a simple program to factorise integers by trial division. The next task was to learn and implement Pollard rho algorithm (hopefully, order(s) of magnitude faster than trial division). Surprisingly, my pollard-rho program is hardly faster than the trial-division one, and just "hangs" (runs for more hours than I can possibly wait) factorising a repunit (31 ones):
1111111111111111111111111111111 = 2791 • 6943319 • 1996182842661949511
which trial-division manages to factor in reasonable time (18 seconds for a C program, 11 minutes for Python).
My question is, is my implementation wrong or blatantly ineffective, or what else am I missing?
I admit I definitely don't understand the algorithm well enough, especially the "loop detection" part or the part when the algorithm "returns an error" (xi == xj for some i!=j).
I bluntly implemented a ready algorithm from https://en.wikipedia.org/wiki/Pollard's_rho_algorithm :
x ← 2, y ← 2, d ← 1
while d = 1:
x ← g(x)
y ← g(g(y))
d ← gcd(|x - y|, n)
if d = n:
return failure
else:
return d
Here is my Python implementation:
import sys
def gcd(a, b): # assumes a >= 0
if b < 0:
b = -b
if a < b:
a, b = b, a
if a == 0:
return float('NaN') # gcd(0, 0)
while b > 0:
a, b = b, a % b
return a
def pollard_rho(n):
x = 2
y = 2
d = 1
while (d == 1):
x = (x * x + 1) % n
y = (y * y + 1) % n
y = (y * y + 1) % n
d = gcd(n, x - y)
if d == n:
return float('NaN') # signal an error
else:
return d
if len(sys.argv) < 2:
sys.exit(f"Usage: {sys.argv[0]} <number_to_factor>")
n = int(sys.argv[1])
dividend = n
divisors = []
while dividend > 1:
divisor = pollard_rho(dividend)
if divisor != divisor: # pollard_rho() returned an error
divisors.append('?' + str(dividend) + '?') # god knows if dividend is prime?
break
else:
divisors.append(divisor) # may not be prime, I don't care at this stage
dividend //= divisor
print(n, "=", ' * '.join(map(str, divisors)))
```