Is this way of factorizing an integer faster than trial division?

262 Views Asked by At

I recently found a way to quickly find a factor of a given integer, starting from another integer not too far away from that factor and I was wondering whether this way is faster than trial division. It all depends on a simple lemma.

Lemma: Let $n,d\in\mathbb{N}$ with $d<n$, $d\mid n$ and let $k\in\mathbb{Z}_{\ge 0}$ with $$k<\frac{d^2}{n-d}.$$ Let $r$ be the least positive integer congruent to $-n$ modulo $d+k$. Now, either $\gcd(n,d+k)$, $\gcd(n,r)$ or $2$ is a non-trivial divisor of $n$.

Proof: We have: $$(d+k)\frac nd=n+\frac{nk}{d}\implies -n\equiv \frac{nk}{d}\pmod {d+k}$$ Also: \begin{align*} k &< \frac{d^2}{n-d}\\ nk-dk &< d^2\\ nk &< d^2+dk\\ \frac{nk}{d} &< d+k \end{align*} from which it follows that $r=\frac{nk}{d}$, so either $\gcd(r,n)$ is a non-trivial divisor of $n$ or $d\mid k$, in which case $\gcd(n,d+k)$ is a non-trivial divisor of $n$, except if $d=k=n/2$, in which case we have $2\mid n$. Q.E.D.

Now, In order to find a $d$, we just have to find $d+k$, so instead of finding one value ($d$), we need to find one of the $d^2/(n-d)$ values of $d+k$. We've nog longer got a single needle in a haystack, but a whole bunch of them. And we only need to find one.

Questions

  • First, did I make any mistakes?
  • If not, how long would it take to find a single non-trivial factor of a given $n$?
  • And how long would complete prime factorization take?
  • Is this faster than trial division (trivial if you've got the answer to the above questions)

Edit: From @mixedmath's answer, we know that we can ignore the time complexity of computing the $\gcd$ when calculating the time complexity of finding a single non-trivial factor. Let $$S(n):=\sum_{d\mid n, d\neq n}\left\lfloor\frac{d^2}{n-d}\right\rfloor$$ This is how many values of $d+k$ will give us a non-trivial divisor of $n$. When randomly guessing a value of $d+k$, the chance of succes is: $$\frac{S(n)}{n}$$ which means the chance of failure is $1-S(n)/n$. This means the time complexity should be: \begin{align*} &O(\log_{1-S(n)/n}0.5)\\ &= O\left(\frac{-\log 2}{\log(1-S(n)/n)}\right)\\ &= O\left(\left(\log\left(1-\frac{S(n)}{n}\right)\right)^{-1}\right) \end{align*}

1

There are 1 best solutions below

1
On

Let's try your method on factoring $17\cdot 19 = 323$. There are two factors to identify.

Let's suppose that we were to identify the factor $d = 17$. What range of $k$ suffices? You say we need $$ 0 \leq k < \frac{d^2}{n-d} = \frac{17^2}{323 - 17} = \frac{289}{306} = 0.944...$$ This only leaves $k = 0$. So to guess $d$, one would need to guess exactly the factor $17$. This saves no time.

Perhaps the other factor is more helpful. Let's suppose we were to identify $d = 19$. Then $$ 0 \leq k < \frac{19^2}{323 - 19} = \frac{361}{304} = 1.1875.$$ So we would need to guess $d+k$ as either $19$ or $20$. This is not giving much space.

Of the three possible guesses for $d+k$, the only one that isn't a factor of $n$ directly (i.e. the only one that your method proposes that would be "another needle in the haystack") is $20$. Let's check that your method on $20$ works.

We compute $\gcd(n, d+k) = \gcd(323, 20) = 1$.

You say $r \equiv -n \pmod{d+k}$, and is chosen as the least positive residue. Here, that means $r = 17$, and $\gcd(n, r) = \gcd(323, 17) = 17$. So this identifies $17$ as a divisor.


The good news is that it does appear that your method does appear to work. The bad news is that in cases where $n$ is a product of two primes that are relatively near each other in size (the classic challenge), your method does not do very well and adds almost nothing to trial division. Even if the numbers are larger, you often gain nothing.

Your method works best when $n$ contains many factors. In particular, your method is best at finding the smallest nontrivial factor (corresponding to the largest nontrivial $d$, which has the largest $k$ range). When $n$ has many factors, you identify much more rapidly than random-guess trial division. Once you've identified the small factors, the method becomes worse and worse at identifying larger factors.


As a final note, you ask about the time complexity of computing gcds. Through the Euclidean algorithm, one can compute gcds in logarithmic time. You can pretend that this is instant, as gcds will not be what holds the computation back.

Computing gcds is so fast that many implementations of factorization algorithms first compute the gcd of $n$ with the product of the first very many primes, just to get all the small factors out of the way.