I'm doing a project on Pollard's Rho algorithm, and after lots of readings, I still am very confused about this one part.
I understand why there's a pseudorandom sequence, and I understand why it must cycle. I also understand why we are trying to find pairs of numbers to check their difference and whether they have a GCD with the N = pq in question.
Why are we actively looking for cycles though? Why does looking for cycles help here? I know they are inevitable in a sequence through polynomial modulo N, but why is it better to find the cycle quicker through Floyd's algorithm?
Also, if you're running Floyd's and $x = g(x)$ and $y = g(g(y))$, is the point where $y_i$ and $y_j$ equal each other (a.k.a. the hare finishing its cycle) where $| x_i - x_j |$ contains $p$? If so, why?
If someone could explain and also provide an example to any of these questions, that would be helpful. I know there is a similar question on this topic with an answer, but it's not exactly what I'm looking for, and it's not helping my understanding. It's possible I am misunderstanding what the articles are saying about Floyd's algorithm though.
Let's say we're trying to factor $n = p \cdot q$
Consider a deterministic pseudorandom sequence of integers $x_{k}$. Finding a cycle modulo $p$ within this sequence means that we'll have $i$ and $j$ such that $x_{i} \equiv x_{j} \pmod p$, which implies that $d = \vert x_{i} - x_{j} \vert$ is a multiple of $p$. Then we can take $\gcd(d, n)$ and find $p$, because both $n$ and $d$ are multiples of $p$. We don't precisely know the values $x_{k} \pmod p$ at any point, but we know that if we do find a cycle over $p$, the $\gcd$ operation will return $p$. This is how the algorithm finds cycles - it simply computes the $\gcd$ of $n$ and the difference of two values generated by the sequence - if a cycle is found, a non-trivial factor will be returned by $gcd$.
Let's walk through a simple example by generating a pseudorandom sequence, without applying Floyd's algorithm. Consider $n = 32881 = 131 \cdot 251$. Let's consider the pseudorandom sequence generated by:
So we have $x_{11} \equiv x_{4} \equiv 92 \pmod p$. But since we don't know the value of $p$, we couldn't have calculated $92$ explicitly, nor matched the values $x_{11}$ and $x_{4}$. However, we do know that if their values modulo $p$ match, the $\gcd$ operation will return a non-trivial factor of $n$. Indeed, calculating $\gcd(30877 - 16991, 32881) = 131$ yields $p$.
When used in Pollard's Rho, Floyd's algorithm is better than linear cycle finding and some other algorithms for two reasons:
Unless I'm misunderstanding what you asked, this is not necessarily true. You can experiment and find a counterexample relatively easily.