I've recently learned about Pollard's rho factorization algorithm, and the way I understand it, it works by using a pseudorandom sequence of integers $x_{k}$ ranging from $0$ to $n-1$, where $n$ is the number to be factored, to find repetitions in an unknown sequence $y_{k} = x_{k} \bmod p$, where $p$ is an unknown, non-trivial factor of $n$. Such a repetition $y_{i}=y_{j}$ occurs when $x_{i}\equiv x_{j} \bmod p$, which can be detected by checking whether $\gcd(x_{i}-x_{j}, n)\gt 1$.
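To make this detection step concrete, here is a naive sketch (my own illustration, with a hypothetical helper name) that simply compares all pairs of terms:

```python
from math import gcd

def find_factor_by_pairs(n, xs):
    # Check gcd(x_i - x_j, n) for every pair of terms; a nontrivial
    # gcd means x_i ≡ x_j (mod p) for some hidden factor p of n.
    for i in range(len(xs)):
        for j in range(i):
            d = gcd(xs[i] - xs[j], n)
            if 1 < d < n:
                return d
    return None

print(find_factor_by_pairs(6, [5, 3, 2, 4, 1, 0]))  # → 2
```

Note that this pairwise check costs $O(k^2)$ gcd computations for $k$ terms, which is part of what the question below is about.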
From what I could find on the Internet, Floyd's cycle-detection algorithm or a variant thereof is used to find such repetitions, and it is this choice that I don't understand.
- First, Floyd's algorithm requires that each element of the sequence depend solely on the element before it, which, even if it is the case for $x_{k}$, is not necessarily the case for $y_{k}=x_{k}\bmod p$. For example, suppose we want to factorize $6$ and take the sequence $5, 3, 2, 4, 1, 0, 5, 3, 2, 4, 1, 0,\ldots$. Its cycle is $5, 3, 2, 4, 1, 0$, in which every element occurs exactly once. However, taking this sequence $\bmod 2$ yields $1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0,\ldots$, and Floyd's algorithm cannot be applied to it to find the cycle $1, 1, 0, 0, 1, 0$.
- Besides, we are not even interested in a cycle, only in repetitions. Whether we find the congruence $\bmod 2$ between $5$ and $3$, $5$ and $1$, or $4$ and $2$ in the first sequence is irrelevant.
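The first bullet can be checked mechanically: in the mod-2 sequence of the example above, a $1$ is followed sometimes by $1$ and sometimes by $0$, so no single-valued function $g$ with $y_{k+1}=g(y_k)$ can generate it. A quick sketch (variable names are my own):

```python
# The sequence 5, 3, 2, 4, 1, 0 repeated, reduced mod 2.
seq = [5, 3, 2, 4, 1, 0] * 2
ys = [x % 2 for x in seq]

# Collect, for each value, the set of values that follow it.
successors = {}
for a, b in zip(ys, ys[1:]):
    successors.setdefault(a, set()).add(b)

# Both 0 and 1 have two distinct successors, so y_{k+1} is not a
# function of y_k alone for this particular sequence.
print(successors)
```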
Please correct me if my understanding of Pollard's rho algorithm as described above is wrong. But if it is not, it seems to me that one could just as well try out some random numbers between $2$ and $n-1$ and see if they share a divisor with $n$ other than $1$. So what is the benefit of using a cycle detection method like Floyd's algorithm?
I also found this related question, but it has no answers, and the comment on it does not really settle the matter either.
First, the algorithm uses a "pseudorandom" sequence of the form $x_{k+1}=f(x_k)$ (not of any other form). This is done to (try to) exploit the birthday paradox: if $f$ is chosen at random, then the expected period length is $O(\sqrt{n})$, and one can expect similar behaviour when the choice of $f$ is reasonably restricted.
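The birthday-paradox heuristic can be illustrated with a small simulation (a sketch of my own; the function and parameter names are not from the original):

```python
import math
import random

random.seed(0)  # fixed seed for reproducibility

def rho_length(f, x0=0):
    # Number of iterations of x -> f[x] from x0 until a value repeats,
    # i.e. the tail length plus the cycle length of the "rho" shape.
    seen = set()
    x, k = x0, 0
    while x not in seen:
        seen.add(x)
        x = f[x]
        k += 1
    return k

# Average rho length over random functions on {0, ..., n-1}.
n, trials = 10_000, 200
avg = sum(rho_length([random.randrange(n) for _ in range(n)])
          for _ in range(trials)) / trials
print(avg, math.sqrt(n))  # the average is on the order of sqrt(n)
```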
Second, these restrictions on $f$ include $f(x\bmod d)\equiv f(x)\pmod d$ for any $d\mid n$ (and are usually satisfied simply by further restricting $f$ to be a polynomial with integer coefficients, to say the least). This is done so that $y_{k+1}=g(y_k)$ for $y_k=x_k\bmod p$, where $p$ is a prime factor of $n$ (any one, but let us consider the smallest), so that if (as is again expected) $g$ behaves like a randomly chosen function, the expected period length of $y_k$ is (only!) $O(\sqrt{p})$. This is the core idea of the algorithm; it gives an expected running time of $O(n^{1/4+\epsilon})$ and makes the method "well-suited" for small $p$ (if we forget newer methods).
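Putting the two points together, a minimal sketch of the algorithm with Floyd's cycle detection might look as follows, assuming the common choice $f(x)=x^2+c \bmod n$ (this is my own illustration, not a robust implementation):

```python
from math import gcd

def pollard_rho(n, x0=2, c=1):
    # f is a polynomial with integer coefficients, so x ≡ x' (mod d)
    # implies f(x) ≡ f(x') (mod d) for every divisor d of n.
    f = lambda x: (x * x + c) % n
    x = y = x0
    d = 1
    while d == 1:
        x = f(x)                # tortoise: x_k
        y = f(f(y))             # hare: x_{2k}
        d = gcd(abs(x - y), n)  # > 1 exactly when x_k ≡ x_{2k} (mod p)
    return d if d < n else None  # d == n: failure; retry with other c, x0

print(pollard_rho(8051))  # → 97 (8051 = 83 * 97)
```

Note that the tortoise/hare comparison only ever tests $x_k$ against $x_{2k}$, so it can miss the earliest repetition; what it buys is $O(1)$ gcds and $O(1)$ memory per step instead of the $O(k^2)$ gcds of pairwise checking, which is precisely the benefit the question asks about.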