Can factoring with the sum of 4 squares be made more efficient?

777 Views Asked by At

We have seen that it was possible to use the sum of two squares to factor numbers (see Can the sum of two squares be used to factor large numbers? )

The main drawback is the fact that the method cannot factor numbers of the form $N=(4k_1-1)*(4k_2-1)$. Since every integer can be expressed as a sum of 4 squares, the method should be able to handle all numbers.

We start with the following example: $N=7*13=91$. We calculate its sum of 4 squares representation (4sq rep) $N=a^2+b^2+c^2+d^2$ and we get:

$$N=91=(5,5,5,4),(5,7,4,1),(5,8,1,1),(8,3,3,3),(9,3,1,0)$$

The idea is to look for the lower factor as one possible square or a combination of squares $c=(a^2+b^2+...)$ or a mixed combination $c=(a^2+b+...)$ or just $c=(a+b+...)$ that, when added together, will provide a number which shares a factor with the number $N$ we want to factor. We then take the $gcd(N,c)$ to see if we have a factor.

The second 4sq rep provides the square of the factor $7^2$ but also $5^2+4^2+1^2=42=7*6$. The third representation provides an example of factor by adding $a+b=5+8=13$. The forth representation provides an example of a mixed mode $8^2+3+3=70=7*10$. The last one provides an example of $(a+b+c)=9+3+1=13$, the larger factor.

Some of the problems are:
1-for small numbers with not many 4sq rep's, it doesn't take a lot of time to calculate them and check the different combinations to find a factor.
2-for large numbers, it takes a long time to calculate all these 4 sq rep's.
3-Checking all the combinations also can take a lot of time.
4-good representations are not always among the first to be calculated and tested. There is no known method to change the order in which useful representations appear.

One way to speed up the process is to calculate one representation at a time, test it and if no factors are found, calculate another one and test it.
Another way is to calculate just one representation and, if no factor is found, calculate the 4sq rep of the squares of the first representation. The number of combinations to check can rise quickly when we go from $4$ squares to $16$ squares. For small numbers, it's probably better to calculate the next 4sq rep than to expand the squares of the first 4sq rep.
A faster way to find a factor for small numbers is to expand the squares into sums of triangular numbers and try to find a combination that can lead to a factor. I have not tested the decomposition of squares in triangular numbers with large numbers.

There are probably other ways to speed up the process but I can't seem to find them.

The question is: Can this method be made efficient?

The other, more fundamental question, is which sum of squares is better suited to factoring large numbers, $4$, $6$, $8$...?

2

There are 2 best solutions below

4
On

So I was interested in a similar factorization algorithm, so I'll just give you my feedback.

First of all, generating all sum of four square representations (SFSRs) is extremely computationally expensive. The total number of representations for a number N is $$24\sum_{d|N, odd} d$$which grows between $\Omega (N)$ and $O(N\log\log N)$; with that many steps, you may as well do trial division.

Second, there doesn't seem to be any reason for your value $c$ to have a nontrivial $\gcd$ with $N$i.e. most of the time, $\gcd(N,c)=1$. Hoping that perhaps one such way of generating $c$ results in a nontrivial $\gcd$, the runtime is then dependent on how many positive integers have a nontrivial $\gcd$ with $N$. If $N=pq$ for distinct primes $p$ & $q$, then this number would just be $p+q$, meaning if you consider your $c$ to be randomly chosen from $1$ to $N$, you would have roughly a $1/\sqrt{N}$ chance of landing on something not coprime to $N$.

Third, this technique doesn't seem at all reminiscent to the two square factorization method you mentioned. Euler's method was to reverse the Fermat-Brahmagupta identity, which ended up being as simple as calculating a couple $\gcd$s. An analogous method for four squares would be to reverse Euler's Four-Square Identity.

The algorithm I was interested in did something similar, but it restricted how many SFSRs it generated, then the idea was to combine them to solve for the squares of the factors' SFSRs. It doesn't work for a couple of reasons, but perhaps the method was a little better.

7
On

I'm still not entirely sure what you're doing. There's certainly nothing preventing us from adding more squares or combining the terms with the hope of finding a factor, but I don't see any theoretical reasons why it should be any more likely to find a factor than randomly guessing, and if you look at the probability of landing on a number with nontrivial gcd with $N$, it's absurdly unlikely.

If you instead try to reverse Euler's Four-Square Identity, then it at least becomes theoretically clear why it should be more likely. Nonetheless, I've never been able to make it work