I'm wondering something about the application of Pollard's p-1 method. A step in it consists of:
$$ \prod_{\text{primes } q \leq B}q^{\lfloor\log_q B\rfloor}$$
Specifically, I'm focusing on the part $\text{primes } q \leq B$, which I understand to mean all primes up to $B$ (where $B$ is an arbitrarily chosen bound). Since finding primes is a non-trivial task, what's the point of the $p-1$ method? If you're already generating a list of prime numbers, why not just use trial division? Wouldn't it be simpler and faster? I guess the bound can be varied and in some circumstances you can get lucky. Is $p-1$ only efficient if you somehow already know the primes up to $B$?
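For concreteness, here is how I would compute that exponent (a Python sketch; the function name is mine). Each prime $q \leq B$ contributes its largest power not exceeding $B$, which is exactly $q^{\lfloor\log_q B\rfloor}$:

```python
def stage1_exponent(B):
    """Compute M = prod over primes q <= B of q^floor(log_q B),
    i.e. the product of the largest power of each prime not exceeding B."""
    # simple sieve of Eratosthenes up to B
    sieve = [True] * (B + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(B**0.5) + 1):
        if sieve[i]:
            sieve[i*i::i] = [False] * len(sieve[i*i::i])
    M = 1
    for q in range(2, B + 1):
        if sieve[q]:
            qk = q
            while qk * q <= B:  # largest power of q that is still <= B
                qk *= q
            M *= qk
    return M
```

For example, `stage1_exponent(10)` gives $2^3 \cdot 3^2 \cdot 5 \cdot 7 = 2520$, which is $\operatorname{lcm}(1,\dots,10)$.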
For selected bounds $B_1$ and $B_2$, the prime factors $p$ that you can find are those for which $p-1$ is $B_1$-smooth (all prime factors of $p-1$ below $B_1$) or $B_1,B_2$ 1-semismooth (all prime factors of $p-1$ below $B_1$ except for one in $[B_1,B_2]$). Not exactly, but more or less this.
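These two conditions can be checked directly by factoring $p-1$ with trial division (a Python sketch under the definitions above; the function name and return values are my own choices):

```python
def classify(p, B1, B2):
    """Classify p-1 as 'smooth' (all prime factors <= B1), 'semismooth'
    (all prime factors <= B1 except exactly one in (B1, B2]), or None."""
    n = p - 1
    large = []  # prime factors of p-1 exceeding B1, with multiplicity
    d = 2
    while d * d <= n:
        while n % d == 0:
            if d > B1:
                large.append(d)
            n //= d
        d += 1
    if n > 1 and n > B1:  # leftover prime cofactor
        large.append(n)
    if not large:
        return "smooth"
    if len(large) == 1 and large[0] <= B2:
        return "semismooth"
    return None
```

For instance, with $B_1=5$, $B_2=13$: $13-1 = 2^2\cdot3$ is smooth, $23-1 = 2\cdot11$ is 1-semismooth, and $59-1 = 2\cdot29$ is neither.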
For trial division, you'd need to do $m$ divisions for $m$ primes.
For $p-1$, however, the number of operations per prime tested shrinks as you test more primes, since the factorizations of $p-1$ will coincide. For example, if $$ \begin{align*} p_1-1 &= 2\cdot3\cdot q\\ p_2-1 &= 2\cdot5\cdot q \end{align*} $$ then it suffices to do a single computation, $$ a^{2\cdot3\cdot5\cdot q}\pmod N, $$ which tests both primes at once and is not much more expensive than either of the two separate exponentiations.
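Stage 1, which exploits exactly this sharing, can be sketched as follows (a minimal Python version; the function name and the tiny example modulus are my own):

```python
from math import gcd

def pollard_p1_stage1(N, B1, a=2):
    """Stage 1 of Pollard's p-1: raise a to every prime power <= B1
    (i.e. compute a^M mod N), then look for a factor via gcd(a^M - 1, N)."""
    # sieve the primes up to B1
    sieve = [True] * (B1 + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(B1**0.5) + 1):
        if sieve[i]:
            sieve[i*i::i] = [False] * len(sieve[i*i::i])
    for q in range(2, B1 + 1):
        if sieve[q]:
            qk = q
            while qk * q <= B1:  # largest power of q not exceeding B1
                qk *= q
            a = pow(a, qk, N)
    g = gcd(a - 1, N)
    return g if 1 < g < N else None  # None: no proper factor found
```

For example, `pollard_p1_stage1(299, 5)` returns $13$, because $299 = 13\cdot23$ and $13-1 = 2^2\cdot3$ is $5$-smooth.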
This collision of factorizations is the main property that makes it fast, though not the only one.
To give you some figures, you can roughly find about 30% of the primes in $[2^{31},2^{32}]$ using $B_1 = 260, B_2=11600$.
Edit: This was for another method, the actual value might be 5% to 10% instead.
(Notice that this implies that 30% of the primes in that range have $p-1$ equal to a $B_1$-smooth or $B_1,B_2$ 1-semismooth number. Even though they are 32-bit primes, the prime factors of $p-1$ are often fairly small.)
For trial division you would need $$ 0.30 \cdot (\pi(2^{32}) - \pi(2^{31}))\approx 3\cdot10^7 $$ or 30 million divisions to trial-divide those primes, where $\pi$ is the prime counting function.
For $p-1$, stage 1 is faster than, or of the same order as, stage 2, so we just look at the latter and multiply by 2. In stage 1 you compute $$ b = a^M\pmod N $$ In stage 2 you more or less compute $$ b^{q_i}\pmod N $$ for each prime $B_1 < q_i < B_2$. Now $[B_1,B_2]$ contains only $1341$ primes, and there are ways to make each $$ b^{q_i} \pmod N $$ cost roughly only one multiply-mod $N$, which is faster than 2 divisions.
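A naive stage 2 looks like this (a Python sketch; the fast-multiplication trick that brings the per-prime cost down to roughly one multiply-mod is deliberately omitted, so this version does a full exponentiation per prime):

```python
from math import gcd

def pollard_p1_stage2(N, b, B1, B2):
    """Naive stage 2 of Pollard's p-1: given b = a^M mod N from stage 1,
    accumulate prod(b^q - 1) mod N over primes B1 < q <= B2 and gcd with N."""
    # sieve the primes up to B2
    sieve = [True] * (B2 + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(B2**0.5) + 1):
        if sieve[i]:
            sieve[i*i::i] = [False] * len(sieve[i*i::i])
    acc = 1
    for q in range(B1 + 1, B2 + 1):
        if sieve[q]:
            acc = acc * (pow(b, q, N) - 1) % N
    g = gcd(acc, N)
    return g if 1 < g < N else None  # None: no proper factor found
```

For example, with $N = 1357 = 23\cdot59$, $B_1 = 5$, $B_2 = 13$: stage 1 misses $23$ since $23-1 = 2\cdot11$ has the factor $11 > B_1$, but stage 2 picks it up because $11 \in (B_1, B_2]$.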
Therefore an upper bound on the cost here is only about $$ 2\cdot1341\cdot2 = 5364 $$ division-equivalents (2 divisions' worth per prime, 1341 primes, times 2 to include stage 1).
As you can see, this method is a few thousand times faster than trial division.
Edit: in the case of 5% to 10% instead of 30%, this ratio drops by a factor of about 3 to 6, but it is still large.