What was the heuristics behind Gauss's guess on the asymptotic distribution of k-almost primes?

194 Views Asked by At

My question refers to Gauss's conjecture on the number $\pi_{k}(n)$ of k-almost primes less then a given number $n$. Gauss conjectured that: $$\pi_{k}(n) \sim \biggl(\frac {{n}}{{\log (n)}}\biggr)\frac {{(\log (\log (n)))^{k - 1}}}{{(k - 1)!}}\,.$$

To be more precise historically, in Gauss's writing appear only the cases $k=1$ (the prime number theorem), $k = 2$ and $k = 3$, but the general form of the conjecture can be understood from these cases as well.

I found it very difficult to understand how Gauss came up with this. In contrast to the prime number theorem, where he might have conjectured it on the basis of empirical numeric tables, the complicated form of the conjecture in the cases $k= 2$ and $k = 3$ is much more difficult to guess empirically. So I think he might have had a kind of heuristic method to come up with this conjecture.

Something is not clear historically about Gauss's asymptotic laws in number theory. So if anybody has a clue, or even an example of an heuristic that leads to this conjecture, I'll be glad to hear.

1

There are 1 best solutions below

1
On BEST ANSWER

Once you know or have (correctly) guessed that $\pi(x) \sim \frac{x}{\log x}$, you don't need heuristics or guesses to come up with the asymptotic behaviour of the $k$-almost primes. You can compute that from the asymptotic behaviour of $\pi$.

Let's look at the semiprimes first - unsurprisingly, the simplest case. If $n \leqslant x$ is a semiprime, then (at least) one prime factor of $n$ must be $\leqslant \sqrt{x}$. Hence

$$\sum_{p \leqslant \sqrt{x}} \pi\biggl(\frac{x}{p}\biggr) \tag{$\ast$}$$

counts the semiprimes not exceeding $x$. Not exactly, since semiprimes both of whose prime factors are $\leqslant \sqrt{x}$ are counted twice. Subtract the count of those. That's $\frac{1}{2} \pi(\sqrt{x})\bigl(\pi(\sqrt{x} \pm 1\bigr)$, the sign depending on whether $\pi_2(x)$ shall count only squarefree semiprimes or also the squares of primes. The difference is obviously asymptotically negligible. The overcounting is $O(\pi(\sqrt{x})^2) = O\bigl(\frac{x}{(\log x)^2}\bigr)$, so just from looking at the even semiprimes we can conclude that it's asymptotically irrelevant.

Hence it suffices to compute $(\ast)$ using $\pi(x) \sim \frac{x}{\log x}$. Since the arguments of $\pi$ in $(\ast)$ are all $\geqslant \sqrt{x}$, it's easy to see that

$$\sum_{p \leqslant \sqrt{x}} \pi\biggl(\frac{x}{p}\biggr) \sim \sum_{p \leqslant \sqrt{x}} \frac{x}{p\log \frac{x}{p}} = \frac{x}{\log x} \sum_{p \leqslant \sqrt{x}} \frac{1}{p\bigl(1 - \frac{\log p}{\log x}\bigr)}\,.$$

From the (assumed or proven) asymptotics of $\pi(x)$, via summation by parts one obtains the relations

$$\sum_{p \leqslant y} \frac{\log p}{p} \sim \log y \qquad\text{and}\qquad \sum_{p \leqslant y} \frac{1}{p} \sim \log \log y,$$

and together with

$$1 < \frac{1}{1-y} \leqslant 1 + 2y$$

for $0 < y \leqslant \frac{1}{2}$ we find

$$\sum_{p \leqslant \sqrt{x}} \frac{1}{p} < \sum_{p \leqslant \sqrt{x}} \frac{1}{p\bigl(1-\frac{\log p}{\log x}\bigr)} \leqslant \sum_{p \leqslant \sqrt{x}} \frac{1}{p} + \frac{2}{\log x}\sum_{p \leqslant \sqrt{x}} \frac{\log p}{p}\,.$$

Hence

$$\sum_{p \leqslant \sqrt{x}} \frac{1}{p}\Biggl(\frac{1}{1-\frac{\log p}{\log x}} - 1\Biggr)$$

remains bounded, and

$$\pi_2(x) \sim \sum_{p \leqslant \sqrt{x}} \pi\biggl(\frac{x}{p}\biggr) \sim \frac{x}{\log x} \log \log \sqrt{x} \sim \frac{x}{\log x}\log \log x\,.$$

For $3$-almost primes, we count similarly.

$$\sum_{p \leqslant \sqrt[3]{x}} \pi_2\biggl(\frac{x}{p}\biggr)$$

counts the products $\leqslant x$ of three primes having a prime factor $\leqslant \sqrt[3]{x}$. Of course these are all $3$-almost primes not exceeding $x$. But the $3$-almost primes having two prime factors $\leqslant \sqrt[3]{x}$ are counted twice, so we must subtract their count, and arrive at

$$\sum_{p \leqslant \sqrt[3]{x}} \pi_2\biggl(\frac{x}{p}\biggr) - \sum_{p \leqslant q \leqslant \sqrt[3]{x}} \pi\biggl(\frac{x}{pq}\biggr)\,.$$

But those with all three factors $\leqslant \sqrt[3]{x}$ now have been added thrice and subtracted thrice, so that count has to be re-added. But their count is $O(\pi(\sqrt[3]{x})^3) = O\bigl(\frac{x}{(\log x)^3}\bigr)$, hence asymptotically insignificant. (The "thrice added and thrice subtracted" isn't correct for numbers with a repeated prime factor, but looking at those shows their number is negligible.) Using the obtained asymptotics for $\pi_2$, we calculate

$$\sum_{p \leqslant \sqrt[3]{x}} \pi_2\biggl(\frac{x}{p}\biggr) \sim \frac{x(\log \log x)^2}{\log x}$$

in a similar way as above (the $\log \log \frac{x}{p} = \log \log x + \log \bigl(1 - \frac{\log p}{\log x}\bigr)$ factor causes a bit of extra work but no difficulty), and - omitting some details to justify ignoring some asymptotically negligible terms - we obtain

\begin{align} \sum_{p\leqslant q \leqslant \sqrt[3]{x}} \pi\biggl(\frac{x}{pq}\biggr) &\sim \frac{x}{\log x}\sum_{p,q} \frac{1}{pq} \\ &= \frac{x}{\log x} \sum_{q \leqslant \sqrt[3]{x}} \frac{1}{q}\sum_{p \leqslant q} \frac{1}{p} \\ &\sim \frac{x}{\log x} \sum_{q\leqslant \sqrt[3]{x}} \frac{\log \log q}{q} \\ &\sim \frac{x}{\log x}\cdot \frac{(\log \log \sqrt[3]{x})^2}{2} \\ &\sim \frac{x(\log \log x)^2}{2\log x}\,, \end{align}

using

$$\sum_{p \leqslant y} \frac{(\log \log p)^k}{p} \sim \frac{(\log \log y)^{k+1}}{k+1}$$

which also follows from $\pi(x) \sim \frac{x}{\log x}$. Subtraction yields $\pi_3(x) \sim \frac{x(\log \log x)^2}{2\log x}$.

Generally, the inclusion-exclusion principle gives us

$$\pi_k(x) = \sum_{r = 1}^{k-1} (-1)^{r-1} \sum_{p_1 \leqslant \dotsc \leqslant p_r \leqslant \sqrt[k]{x}} \pi_{k-r}\biggl(\frac{x}{p_1\cdot \dotsc\cdot p_r}\biggr) + O\biggl(\frac{x}{(\log x)^k}\biggr)\,,$$

and one finds \begin{align} \sum_{p_1 \leqslant \dotsc \leqslant p_r \leqslant \sqrt[k]{x}} \pi_{k-r}\biggl(\frac{x}{p_1\cdot \dotsc\cdot p_r}\biggr) &\sim \frac{x(\log \log x)^{k-r-1}}{(k-r-1)!\,\log x}\sum_{p_1 \leqslant \dotsc \leqslant p_r \leqslant \sqrt[k]{x}} \frac{1}{p_1\cdot\dotsc\cdot p_r} \\ &\sim \binom{k-1}{r}\frac{x(\log \log x)^{k-1}}{(k-1)!\,\log x}\,. \end{align}

Since $\sum_{r = 1}^{k-1} (-1)^{r-1}\binom{k-1}{r} = 1$, we find

$$\pi_k(x) \sim \frac{x(\log \log x)^{k-1}}{(k-1)!\,\log x}$$

for all $k \geqslant 1$.

The omitted steps showing that all other terms in the expansions are negligible are tedious, but pose no technical challenge.

It may be worth pointing out that due to the slow growth of $\log \log x$, the lower order terms can overshadow the dominant term for a long time. Except for small $k$, the asymptotics are useless in ranges where the $k$-almost primes can be explicitly counted.