Space complexity of the segmented sieve of Eratosthenes

1.1k Views Asked by At

It's commonplace to say that without compromising on the time complexity of $O(n\log\log n)$, the space complexity of the sieve of Eratosthenes can be reduced to $O(\sqrt{N})$ using a segmented version of the sieve.

This is true, however we can do slightly better for the space complexity. The problem is that different forms for the space complexity appear all over the place. In Crandall and Pomerance we have "if the length of a segment drops below $\sqrt{N}$, the efficiency of the sieve begins to deteriorate." [Crandall and Pomerance, 2nd ed., pp. 121]. To me this implies a $O(\sqrt{N})$ memory requirement. Primesieve claims a space complexity of $O(\sqrt{N})$. On Wikipedia and also here it's claimed that the space complexity is $O(\sqrt{N}\log\log N/\log N )$.

I think part of the problem is that people don't clearly distinguish between space complexity in the sense of the number of bits that need to be stored, as compared to the number of numbers, each having multiple bits, that need to be stored.

I worked out the complexity myself and am posting it as an answer to my own question in case (1) someone finds a mistake or is helpful enough to comment on my answer or (2) someone else finds this useful.

1

There are 1 best solutions below

0
On BEST ANSWER

Suppose that we want to sieve up to $N$ using blocks of length $M$ in a segmented sieve of Eratosthenes. The number of terms to sieve one block is proportional to: \begin{align} t_M &= \sum_{p\leq \sqrt{N}}\left[ 1 + \frac{M}{p} \right]\\ &= \pi(\sqrt{N}) + M\log\log N + O(M) \end{align} where the '$1+$' terms come from performing at least one operation per prime $p$ (e.g. a modulus, or checking if at least one multiple of $p$ falls in the block), and the $M/p$ terms come from sieving the block with each prime $p$. This gives a total time complexity to sieve all $N/M$ blocks as: \begin{align} t_N &= \frac{N}{M} \left[ \pi(\sqrt{N}) + M\log\log N + O(M) \right]\\ &= \frac{ 2N^{3/2}}{M \log N} + N\log\log N + O(N), \end{align} where we have used the prime number theorem on $\pi(\sqrt{N})$ and neglected the resulting error terms (one can include them; they're negligible for our purposes). If we choose $$ M = \frac{\sqrt{N}}{ \log{}N\log\log{}N }, $$ then this gives $t_N \in O(n\log\log n)$, which is the desired time complexity.

Besides $M$, the only other place nontrivial amounts of memory are used is in the list of sieving primes, which is composed of $\pi( \sqrt{N} ) \in O\left( \frac{\sqrt{N}}{\log(N)} \right)$ numbers, each having $O(\log N)$ digits. Thus, the space complexity of the segmented sieve of Eratosthenes is: $$ S = O\left( \frac{\sqrt{N}}{ (\log N)^2\log\log N } + \frac{\sqrt{N}}{\log N} \right) = O\left( \frac{\sqrt{N}}{\log N} \right) $$ in terms of a number of numbers each of width $\log N$ that need to be stored, or $$ S_B = O\left( \frac{\sqrt{N}}{ \log N\log\log N } + \frac{\sqrt{N}}{\log N}\log N \right) = O\left( \sqrt{N} \right) $$ in terms of the number of bits of storage that are needed.