Naive Grouping for Factorization

155 Views Asked by At

I have a naive grouping method for factorization. I am curious as to its novelty and aspects of the code below that will increase its efficiency.

The method is best described with an example:

For n = 798,607 if we start with a seed group of 59, we would have 17 groups of 13,535 and 42 groups of 13,536. Neither the 17 nor the 42 will evenly parse the other grouping totals so we double the seed group. Now we have 118 groups in total, 17 groups of 6,767 and 101 groups of 6,768. The 6,767 can be distributed among the 101 groups, resulting in factors of 101 groups of 7,907.

I have found that all seed groups are odd and less than $\sqrt n \over 2$ and there exist multiple (unknown quantity) seed groups yielding factors. The seed groups (i) are only expanded until the groups are greater than the group counts, or the initial seed group splits do not carry forward (i.e., 17 or 42 groups from seed group 59 needs to be a solution in the 118 groups split.)

#Naive Grouping
NaiveGrouping <- function(number){

  r<- as.integer(sqrt(number)/2)
  i<-3

while(i<=r){
  for (j in 0:(floor(log2(number)))){   


  Seed_Group<-(i*(2^j))

  Counts <- (number/Seed_Group)

  if(Counts%%1==0){
            return(c(Seed_Group=Seed_Group,End_Group=Seed_Group,Factors=Seed_Group,number/Seed_Group))
         }


  count_ceiling <- (ceiling(Counts))
  count_floor <- (floor(Counts))

  initial_group_ceiling <- (number%%i)
  initial_group_floor <- (i - number%%i) 

  group_ceiling <- number%%Seed_Group
  group_floor <- (Seed_Group - (group_ceiling))

      if(group_floor==initial_group_floor | group_ceiling==initial_group_ceiling)    
      if(group_floor==1 | group_ceiling==1) next
      if(group_floor>count_ceiling | group_ceiling>count_floor) break


      if(count_floor%%group_ceiling==0){
               return(matrix(c(group_floor,count_floor,group_ceiling,count_ceiling,i,group_floor+group_ceiling,group_ceiling,number/group_ceiling),4,2,byrow=TRUE,dimnames=list(c("Group Floor","Group Ceiling", "Seed Group, End Group","Factors"),c("Groups","Counts")))) 

              }

      if(count_ceiling%%group_floor==0){
               return(matrix(c(group_floor,count_floor,group_ceiling,count_ceiling,i,group_floor+group_ceiling,group_floor,number/group_floor),4,2,byrow=TRUE,dimnames=list(c("Group Floor","Group Ceiling", "Seed Group, End Group","Factors"),c("Groups","Counts"))))

              }
      }
  i=i+2L
  }
return("Prime")
}

UPDATE: I have discovered a new relationship between the Naive Grouping method above and lattice points on a hyperbola which solve for factors of N.

Using the code provided the result for $N=7,084,149,637$ is:

> (NaiveGrouping(7084149637)) 
                         Groups Counts 
Group Floor                2795 82781 
Group Ceiling             82781 82782 
Seed Group, End Group     10697 85576 
Factors                   82781 85577

The Group Floor (2795) is related to a polynomial of the form $N -x^2 - yx - x = 0$. When $y=2795$ , upon examination of the polynomial and the resulting hyperbola, the only lattice points not containing 1 or N occur with the factors of N for $y=2795 , -2797$.

Here is the Wolfram Alpha link to this example. Try other N to see how this holds for other factorizations.

2795 is the Group Floor within the End Group 85,576. I achieved this from expanding the Seed Group 10,697. However, if I had tested the Seed Group to see whether it generates an integer solution to the polynomial, I would have stopped with a result after just ~ 1,400 steps since the y-coordinates are only odd.

I have updated the algorithm to search for integer solutions to the polynomial at every Seed Group using Newton's Method (detailed here), however, the relationship does not hold for every N. In those instances, the Naive Grouping will return the factors first as the associated y-coordinates are greater than $\sqrt n \over 2$ range required. Expanding the Seed Group for y-coordinate tests via multiplication at every iteration reconciles the range issue. So, it seems to be a probabilistic accompaniment to the deterministic Naive Grouping method.

I would like to know more about lattice points on hyperbolas related to factorization. Particularly if these points have a general range or maximum value relative to N. Also, how Wolfram Alpha determines them so quickly! If anyone has any insights as to why this is occurring or suggestions of related literature please feel free to comment.