Factoring an integer using its Zeckendorf representation

220 Views Asked by At

Notation:

Given an integer $N$, we find its Zeckendorf representation as the sum of non-consecutive Fibonacci numbers. By the Zeckendorf theorem, this representation is unique.

Let $\zeta_N$ denote the ordered sequence of the Fibonacci numbers in the Zeckendorf representation of $N$. The number of Fibonacci numbers (i.e., digits) in the Zeckendorf representation is $O(\log_{\phi}{N})$ where $\phi$ is the Golden Ratio.

We define the following functions on a sequence $S$:

  1. $\mu$ computes the minimum nonzero element in the sequence $S$.

$$ \begin{align} \mu(S) &= \min(x) : {x\in S,x \ne 0} \end{align} $$

  1. $\gamma$ returns the non-trivial factor of $N$ if it is contained in $S$ (or returns $1$ otherwise).

$$ \begin{align} \gamma(S) &= \begin{cases} \gcd(x,N), & \text{if $\exists x \in S$ s.t. $\gcd(x,N) \ne 1$}\\ 1, & \text{otherwise} \end{cases} \end{align} $$

  1. $\delta$ rebalances the sequence by subtracting $\mu(S_k)$ from each non-zero element in $S_k$:

$$ \begin{align} \delta(x) &= \begin{cases} 0, & \text{if $x = 0$} \\ \mu(S_{k})n, & \text{if $x = \mu(S_k), n = $ number of non-zero entries in $S_k$} \\ x - \mu(S_{k}), & \text{otherwise} \end{cases}, \\ x &\in S_{k} \end{align} $$ and

  1. we define the recurrence

$$ S_{k+1} = \langle \delta(x) : x \in S_{k} \rangle $$ We set $S_0 = \zeta_N$ and iteratively compute $S_1, S_2, \cdots, S_k$. After computing each $S_i$, we check if $\gamma(S_i) \ne 1$ and stop if that is true.


Empirical testing:

I wrote a small program that does this and tested it on small $N$. Surprisingly, it seems to find the non-trivial factor.

Here's an example:

1:  Zeckendorf(1387) =
2:   (987, 16),
3:   (377, 14),
4:   (21, 8),
5:   (2, 3)
6:   [Size: 4 terms]
7:  Sum(Zeta(N)) = 1387 = 1387 (check)
8:  After subtraction: 985, 375, 19, 8
9:  Found factors 19x73 in 1 iterations
10: Found factors 19x73 = 1387

Lines 2-5 show the Zeckendorf representation of $1387 = 987 + 377 + 21 + 2$. The pair $(987, 16)$ is of the form $(F_n, n)$.

Here's a step-by-step of what happens with the algorithm:

  1. We have $N=1387=987 + 377 + 21 + 2$.
  2. The minimal non-zero element is $2$. None of these elements have a non-trivial GCD with $N=1387$. So, we proceed to the next step.
  3. We subtract the minimal non-zero element $2$ from all non-zero elements that are not $2$ and add the total subtracted to $2$.
  4. i.e. We get $N=1387=(987 - 2) + (377 - 2) + (21 - 2) + (2 + 2 + 2 + 2) = 985 + 375 + 19 + 8$.
  5. Since $19|1387$, we report it as a factor and it took just one iteration.

Here is an example that takes 5 iterations to find the factor:

Zeckendorf(1216669) =
  (832040, 30),
  (317811, 28),
  (46368, 24),
  (17711, 22),
  (2584, 18),
  (144, 12),
  (8, 6),
  (3, 4)
  [Size: 8 terms]
Sum(Zeta(N)) = 1216669 = 1216669 (check)
After iter #0: 832040, 317811, 46368, 17711, 2584, 144, 8, 3
After iter #1: 832037, 317808, 46365, 17708, 2581, 141, 5, 24
After iter #2: 832032, 317803, 46360, 17703, 2576, 136, 40, 19
After iter #3: 832013, 317784, 46341, 17684, 2557, 117, 21, 152
After iter #4: 831992, 317763, 46320, 17663, 2536, 96, 168, 131
Found factors in 1039x1171 in 5 iterations
Found factors 1039x1171 = 1216669

We can check that $GCD(17663, 1216669) = 1039$.

It is not short like the example with $N=1387$ and shows how each iteration progresses.


I ran the program on $625$ products of those primes ($25$ primes greater than $1000$) and in each instance the factors were found. Some more examples:

Zeckendorf(1018081) = 
  (832040, 30),
  (121393, 26),
  (46368, 24),
  (17711, 22),
  (377, 14),
  (144, 12),
  (34, 9),
  (13, 7),
  (1, 2)
  [Size: 9 terms]
Sum(Zeta(N)) = 1018081 = 1018081 (check)
Found factors in 1009x1009 in 15 iterations
Found factors 1009x1009 = 1018081
Zeckendorf(1022117) = 
  (832040, 30),
  (121393, 26),
  (46368, 24),
  (17711, 22),
  (4181, 19),
  (377, 14),
  (34, 9),
  (13, 7)
  [Size: 8 terms]
Sum(Zeta(N)) = 1022117 = 1022117 (check)
Found factors in 1013x1009 in 66 iterations
Found factors 1013x1009 = 1022117
Zeckendorf(1028171) = 
  (832040, 30),
  (121393, 26),
  (46368, 24),
  (17711, 22),
  (6765, 20),
  (2584, 18),
  (987, 16),
  (233, 13),
  (89, 11),
  (1, 2)
  [Size: 10 terms]
Sum(Zeta(N)) = 1028171 = 1028171 (check)
Found factors in 1009x1019 in 112 iterations
Found factors 1009x1019 = 1028171
Zeckendorf(1030189) = 
  (832040, 30),
  (196418, 27),
  (1597, 17),
  (89, 11),
  (34, 9),
  (8, 6),
  (3, 4)
  [Size: 7 terms]
Sum(Zeta(N)) = 1030189 = 1030189 (check)
Found factors in 1021x1009 in 26 iterations
Found factors 1021x1009 = 1030189

This is a scatterplot of the number of iterations to find the factor vs. $N$.

Scatterplot of iterations vs. N

Observations:

  1. Here are some of the statistics from the run:

$$ \begin{matrix} \text{Count:} & 625 \\ \text{Minimum:} & 1 \\ \text{Maximum:} & 658 \\ \text{Average:} & 82.5264 \\ \text{Stdev:} & 90.3744 \end{matrix} $$

  1. Statistics broken down by whether $N$ is a square or not. Interestingly, squares seem to have a lot of variance and the average is also $\gt 2\times$ the non-squares. Fortunately, it is easy to check for a square before attempting factorization. So, I am not concerned about this.

Stats breakdown by N being square or not

  1. The fact that the scatterplot shows some points (asymptotically) closer to $\sqrt{N}$ means that this may still have worst case behavior in $O(\sqrt{N})$ i.e., no better than trial division.

  2. Scatter plot of $n$, number of iterations vs. $d$, number of digits in Fibonacci square base (See updates below)

Scatter plot of n, number of iterations vs. d, number of digits in Fibonacci square base


Updates:

  1. (July 29, 2023): A straightforward optimization that we can do is compute $\mu(S)$ and $\theta(S)$ where $\theta(S) = \mu(S\backslash \{\mu(S)\})$ i.e., the second smallest element in $S$. We can then subtract the smallest multiple of $\mu(S)$ from $\theta(S)$ so that

$$ \theta(S) - \alpha \cdot \mu(S) \lt \alpha \cdot \mu(S) \\ \implies \theta(S) \lt 2 \alpha \cdot \mu(S) \\ \implies \alpha \gt \frac{\theta(S)}{2\mu(S)} $$

(Update after testing): This doesn't necessarily work as it also skips over the GCD computation on the sequence elements resulting in sub-optimal number of iterations.

  1. (July 29, 2023): An exciting update. The method seems to also work with weighted Fibonacci squares decomposition i.e.,

$$N = \sum (F_{c_i}^{2} \cdot w_i) \text{ where $c_i, w_i, i \in \mathbb{Z}$ }$$.

I had a related MSE question on weighted Fibonacci squares decomposition here. The good news is every positive integer has a weighted Fibonacci squares sum decomposition.

In fact, it performs better, empirically on the same test data.

Scatter plot of iterations for Fibonacci squares decomposition

Here is the comparison summary between using the Zeckendorf representation versus the weighted Fibonacci squares representation:

$$ \begin{matrix} \text{Squares better?} & \text{Count} & \text{pct. of total} \newline \text{TRUE} & 382 & 61\% \newline \text{FALSE} & 243 & 39\% \newline \text{Total} & 625 & 100\% \end{matrix} $$

On 61% of cases Fibonacci squares do better and on the remaining the Fibonacci numbers do better. The average and standard deviation is also markedly better.

  1. (July 29, 2023): Another interesting update. If we use the partition of $N$ as the sum of powers of $2$, then the algorithm works. I am going to conjecture that any base $b$ would work. Perhaps, there is a base $b$ dependent on $N$ that is optimal for factoring $N$ using this algorithm (also a conjecture).

Scatter plot of n vs. d for powers of 2 base

Questions:

  1. Why does this algorithm even work? Can we show a proof for why it should work (or not)?
  2. If this does work, are there any ideas for improving the runtime complexity?