Sum of products of $(1 - 1/p)$

4.7k Views Asked by At

Let $\pi(n)$ denote the number of primes not greater than $n$, and $p_k$ the $k$th prime, so that $p_{\pi(n)}$ denotes the largest prime not greater than $n$. I'm interested in the value of the following sum: $$S(n) = \left(1 - \frac12\right) + \left(1 - \frac12\right)\left(1 - \frac13\right) + \left(1 - \frac12\right)\left(1 - \frac13\right)\left(1 - \frac15\right) + \dots + \prod_{k = 1}^{\pi(n)}\left(1 - \frac1{p_k}\right)$$

The $m$th term of the series is the product of $\left(1 - \frac{1}{p}\right)$ for the first $m$ primes; there are $\pi(n)$ terms in the sum; the last term is the product of $\left(1 - \frac{1}{p}\right)$ over all primes $p \le n$. So to state it more precisely, $$S(n) = \sum_{m=1}^{\pi(n)} \prod_{k=1}^m \left(1-\frac{1}{p_k}\right)$$

Specifically, what are the asymptotics of $S(n)$ as a function of $n$?


Background: this comes up in the analysis of a naive algorithm for generating all primes up to $n$, which is to sequentially try dividing each number by all known smaller primes:

  • The most naive algorithm is for each $m \le n$, divide it (i.e. just find remainder) by each number $d \le m$, and declare $m$ prime if all those remainders are nonzero. This performs $\sim m$ arithmetic operations (finding remainders) for each $m$, so overall it would perform a number of remainder-operations proportional to $1 + 2 + \dots + m + \dots + n = \Theta(n^2)$.
  • Somewhat better is this algorithm, which for each $m \le n$, divides it by each prime $p \le m$ until a zero remainder is found: so all numbers $m$ are tried dividing by $2$, the numbers divided by $3$ would be those that survive (the odd numbers), the numbers divided by $5$ would be the numbers divisible neither by $2$ nor $3$ (thus $\left(1 - \frac12\right)\left(1 - \frac13\right)$ of the numbers), etc.
  • I'm fairly sure this would be worse than the Sieve of Eratosthenes, which does about $\left(\frac{n}{2} + \frac{n}{3} + \frac{n}{5} + \dots + \frac{n}{p_{\pi(n)}}\right) = \Theta(n \log \log n)$ arithmetic operations (none of them division operations actually).

(I'm also aware of various faster algorithms and simple ways to make this faster, but I'm interested in the analysis of this algorithm specifically.)

2

There are 2 best solutions below

6
On BEST ANSWER

From Mertens' theorem we have $$\prod_{k\leq m}\left(1-\frac{1}{p_{m}}\right)\sim\frac{1}{\log\left(p_{m}\right)e^{\gamma}} $$ and so $$S\left(n\right)\sim\frac{1}{e^{\gamma}}\sum_{m\leq\pi\left(n\right)}\frac{1}{\log\left(p_{m}\right)}. $$ Now put $x=p_{\pi\left(n\right)} $. We can write the sum as $$\sum_{m\leq\pi\left(n\right)}\frac{1}{\log\left(p_{m}\right)}=\sum_{p\leq x}\frac{1}{\log\left(p\right)} $$ and so using Abel's summation we have $$\sum_{p\leq x}\frac{1}{\log\left(p\right)}=\frac{\pi\left(x\right)}{\log\left(x\right)}+\int_{2}^{x}\frac{\pi\left(t\right)}{t\log^{2}\left(t\right)}dt $$ and since from the Prime Number Theorem (PNT) holds $$\pi\left(x\right)\sim\frac{x}{\log\left(x\right)} $$ we have $$\sum_{p\leq x}\frac{1}{\log\left(p\right)}\sim\frac{x}{\log^{2}\left(x\right)}+\int_{2}^{x}\frac{1}{\log^{3}\left(t\right)}dt $$ $$=\frac{x}{2\log^{2}\left(x\right)}+\frac{\textrm{Li}\left(x\right)}{2}-\frac{x}{2\log\left(x\right)}+\frac{\log\left(2\right)+1}{\log\left(2\right)} $$ where $\textrm{Li}\left(x\right)$ is the logarithmic integral. Note the that from PNT we have $$p_{\pi\left(n\right)}\sim\pi\left(n\right)\log\left(\pi\left(n\right)\right)\sim n\left(1-\frac{\log\left(\log\left(n\right)\right)}{\log\left(n\right)}\right).$$ I added some details. Now since holds $$\textrm{Li}\left(x\right)=\frac{x}{\log\left(x\right)}\sum_{k=0}^{n}\frac{k!}{\log^{k}\left(x\right)}+O_{n}\left(\frac{x}{\log^{n+2}\left(x\right)}\right) $$ which can be proved iterating the integration by parts we can note that $$\frac{\textrm{Li}\left(x\right)}{2}-\frac{x}{2\log\left(x\right)}=\frac{x}{2\log^{2}\left(x\right)}+O\left(\frac{x}{\log^{3}\left(x\right)}\right)\sim\frac{x}{2\log^{2}\left(x\right)} $$ so $$\sum_{p\leq x}\frac{1}{\log\left(p\right)}\sim\frac{x}{\log^{2}\left(x\right)} $$ and note that $$n\left(1-\frac{\log\left(\log\left(n\right)\right)}{\log\left(n\right)}\right)\sim n $$ so finally $$S\left(n\right)\sim\frac{n}{e^{\gamma}\log^{2}\left(n\right)}.$$

0
On

Sorry I asked the question and then forgot to come back here… I figured out the asymptotics I cared about, to the level of detail I cared about, by simply thinking about the actual algorithm that inspired this question.

To recap the question, the algorithm is the following. Given $n$, we want to find all primes $\le n$:

  • Maintain a list of known primes (initially empty).
  • For each number $r$, where $2 \le r \le n$,
    • try dividing $r$ by each known prime $p < r$ (from the above list)
      • If you encounter a zero remainder, stop (move on to the next $r$).
    • Else (if all remainders are nonzero), then append $r$ to the list of known primes, and continue.

Note that in this algorithm,

  • for every number $r$ ($2 < r \le n$), we try dividing $r$ by $2$
  • for every odd number $r$ ($3 < r \le n$), we try dividing it by $3$ as well (so $1/2$ of the numbers greater than $3$)
  • for every number $r$ not divisible by $2$ or $3$ ($5 < r \le n$), we try dividing it by $5$ as well (so a fraction $(1-1/2)(1-1/3)$ of those numbers).
  • in general, we try dividing by a prime number $p_m$ all the numbers $r > p_m$ that are not divisible by any of the first $m-1$ primes, so a fraction $\prod_{k=1}^{m-1}(1-1/p_k)$ of them.

So the number of division operations the algorithm performs is, and the running time of the algorithm is asymptotically equal to, $$\sum_{m=1}^{\pi(n)}(n-p_m)\prod_{k=1}^{m-1}\left(1-\frac{1}{p_k}\right)$$ (slightly different from the one asked in the question, but I guess it should have the same asymptotics as multiplying the one in the question by $n$ throughout).


Now an obvious upper bound on the algorithm's running time is $n\pi(n)$: we divide each number $r$ by at most $\pi(n)$ primes.

And an obvious lower bound is $0 + 1 + 2 + \dots + (\pi(n) - 1) = (\pi(n)-1)\pi(n)/2$: to determine each prime we divide it by all smaller primes.

This puts the asymptotics as being between $\Theta(\pi(n)^2)$ and $\Theta(n\pi(n))$, so between $\frac{n^2}{\log n \log n}$ and $\frac{n^2}{\log n}$. Similarly, the quantity actually asked in the question should be between $\frac{n}{\log n \log n}$ and $\frac{n}{\log n}$.