What are Green's almost primes?

357 Views Asked by At

In a general-audience talk, Ben Green explains his famous proof with Terence Tao as an application of Szemerédi's theorem, but placing the primes within a smaller set of almost-primes in which they have relative density 50%. This may have been a simplification -- all that is needed is a positive lower density.

He shows a slide around 44:30 in the video which gives the primes together with these almost primes, which seem to be:

35, 55, 65, 77, 85, 91, 95, 115, 119, 133, 143, 145, 155, 161, 185, 187, 203, 205, ...

Now he claims that these are "the same numbers that appeared in Chen's theorem", that is, products of two primes, but that doesn't work. For one, many semiprimes are missing from the above list. For another, the primes have relative density 0 in the set of numbers with at most 2 prime factors.

So what is this set? I tried to find an explanation in his 2004 paper, but as far as I can tell there is no sequence there, and the relevant portion (Proposition 9.1 if I am not mistaken) is just an existence result on relevant measures, not a concrete measure where at least one could find the numbers which have 'high' weight.

I looked it up in the On-Line Encyclopedia of Integer Sequences and found no results. By omitting terms I could find some hits, special subsequences of odd semiprimes, but nothing promising. (I also tried adding in the primes and searching; no luck.) Even the superseeker failed.

Any ideas?

2

There are 2 best solutions below

0
On BEST ANSWER

The following is a slight oversimplification, but I think it answers the heart of your question: have a look at Definition 9.2 in Green-Tao:

$$\Lambda_R(n) := \sum_{\substack{d\mid n\\d \le R}} \mu(d) \log(R/d).$$

Also note in Definition 9.3 that $R$ is taken as a small positive power $N^{c_k}$, where $[1,N]$ is (more or less) the interval in which one is sieving for arithmetic progressions.

The function $\Lambda^2_R(n)$, gets stretched and chopped in a few ways by Definition 9.3, but it's essentially the concrete measure you're looking for. Notice that when the smallest prime factor of $n$ is at least $R$, then $\Lambda_R(n) = \log R$: these turn out to be (the bulk of) the numbers for which the measure is spikiest. By contrast, if $n$ has some prime factors that are smaller than $R$, then we'll get some cancellation from the sign changes of $\mu$.

While it's theoretically conceivable that some highly composite values of $n$ yield sums which conspire to larger values of $\Lambda_R$, this is just not something that happens often enough to matter. Indeed, the average magnitude of $\Lambda_R$ over a suitably long interval is just $O(1)$, and the average value of $\Lambda_R^2$ is $O(\log R)$ (the point of Section 10 is to prove a much stronger form of this), so that the contribution from just the primes in $[R,N]$ has positive density with respect to the measure.

Although Chen's theorem is usually stated in the simple language of semiprimes, the proof actually yields something quite a bit stronger (see for instance p.483 of Opera de Cribro): there are infinitely many primes $p$ such that $p-2$ has at $\le 2$ prime divisors, each of which is at least $p^{1/8}$.

To make sense of Green's comparison to Chen primes, I think it's reasonable in light of the previous paragraph to approximate "semiprimes" by "numbers $n$ whose prime factors are all larger than $n^{c_k}$", a condition that is in some ways weaker (it allows more than 2 prime factors) and in some ways stronger (they are restricted in size). Most significantly, this addresses your concern about the cardinality of semiprimes. While you are of course correct that there are $N \log \log N / \log N$ semiprimes up to $N$, this does not apply to the second set which is only of size $O(N/\log N)$ (see Buchstab's theorem) and thus comparable to the primes themselves.

5
On

The sequence given (I did not cross-check with the presentation) seems to consist of numbers which are product of two distinct primes, both greater than $3$.