An improved(?) expression for sieving intervals for prime numbers

44 Views Asked by At

I was looking at this question which postulates the existence of at least one prime number between the squares of successive primes. One of the complicating issues in addressing that conjecture as stated is the variability of the gap between successive prime numbers. I thought a fixed gap might render the problem more tractable. I look at open intervals of integers $(p_i^2,\dots,(p_i+6)^2)$ for prime numbers $p_i\ge 5$, i.e those having the form $6k\pm 1$. Such intervals contain $12p_i+35$ consecutive integers. Empirically, all such readily observable intervals contain prime numbers, but that is not a proof that every such interval contains at least one prime number.

The posited intervals have the following useful properties: Each prime number $p_j < p_i+6$ will divide multiple numbers contained within the interval, and the interval contains no numbers whose only prime factors are $\ge p_i+6$. Therefore, if there was a suitable sieve to eliminate composite numbers in the interval, there would be no need to employ prime numbers $\ge p_i+6$ in the sieve, nor would prime numbers $< p_i+6$ fail to identify for elimination multiple composite numbers in the interval. For this chosen form of the interval, after eliminating all numbers having prime factors $< p_i+6$, any remaining numbers would be prime.

I began by thinking about inclusion/exclusion, which employs a factor of the form $\prod\left ( 1-\frac{1}{p}\right )$. This works well when the primes included in the product are divisors of a given number $n$, and the interval is $(1,\dots,n)$; this is simply Euler's totient function which reports the number of integers coprime to $n$ in that interval. As a first approximation, I looked at applying this product to the intervals I was interested in, including all prime numbers $<p_i+6$. If we represent the number of integers in the interval associated with $p_i$ as $r_i=12p_i+35$, we can posit that the number of primes in that interval will be approximated by: $$r_i\prod_{2\le p_j < p_i+6}\left ( 1-\frac{1}{p_j}\right )$$ The approximations yielded by this expression uniformly overestimate the number of primes that actually exist within the intervals. I believe that this occurs because for each sieving prime (particularly those that are not an even divisor of the interval size), the number of composite numbers it identifies for removal is not $\frac{r_i}{p}$, but $\left \lfloor \frac{r_i}{p}\right \rfloor $

Accordingly, I formulated the following expression as an approximation of the number of primes in the intervals I am interested in: $$r_i\prod_{2\le p_j < p_i+6}\left ( 1-\frac{\left \lfloor \frac{r_i}{p_j}\right \rfloor }{r_i} \right)$$ The accurate number of composite numbers identified bfor removal, divided by the whole number of integers in the interval, gives a more accurate estimate of the fraction of numbers being removed. This appears to afford a much better approximation of the number of primes in the intervals, for the limited number of examples I have been able to check by manual computation.

I may not be the first person to have posited such an expression, and its accuracy when extended to a large number of cases may already have been investigated by persons with access to greater computational resources than I have at my disposal. My main questions are: Is the expression I have formulated already known? And if so, has its accuracy been determined? Secondarily, I am curious as to whether the specific intervals I am working with have any influence on the accuracy of the expression. Answers to these questions may indicate whether it would be prudent to conjecture that the intervals I am looking at always contain at least one prime number.