Unusual pattern in the distribution of primes of form $4n\pm1, 6n\pm 1$ [Chebyshev's bias]

1.7k Views Asked by At

I have recently noticed an unusual pattern in the distribution of odd primes.

Each one of the following sets contains approximately half of all odd primes:

  • $A_n=\{4k+1: 0\leq k\leq n\}=\{1,5,9,13,\dots,4n+1\}$
  • $B_n=\{4k+3: 0\leq k\leq n\}=\{3,7,11,15,\dots,4n+3\}$
  • $C_n=\{6k+1: 0\leq k\leq n\}=\{1,7,13,19,\dots,6n+1\}$
  • $D_n=\{6k+5: 0\leq k\leq n\}=\{5,11,17,23,\dots,6n+5\}$

More precisely:

  • Let $P(S)$ denote the number of odd primes in the set $S$
  • Let $\pi(x)$ denote the number of odd primes smaller than $x$

Then for every sufficiently large value of $n$:

  • ${P(A_n)}\approx{P(B_n)}\approx\frac12\pi(4n+4)$
  • ${P(C_n)}\approx{P(D_n)}\approx\frac12\pi(6n+6)$

Now, all of this is pretty easy to observe (though probably not so easy to prove).

The following facts are subsequently obvious for every sufficiently large $n$ as well:

  • ${P(A_n)}\leq{P(B_n)}\implies{P(A_n)}\leq\frac12\pi(4n+4)\leq{P(B_n)}$
  • ${P(A_n)}\geq{P(B_n)}\implies{P(A_n)}\geq\frac12\pi(4n+4)\geq{P(B_n)}$
  • ${P(C_n)}\leq{P(D_n)}\implies{P(C_n)}\leq\frac12\pi(6n+6)\leq{P(D_n)}$
  • ${P(C_n)}\geq{P(D_n)}\implies{P(C_n)}\geq\frac12\pi(6n+6)\geq{P(D_n)}$

This is because $A_n$ and $B_n$ as well as $C_n$ and $D_n$ are "complementary" to each other:

  • The set ${A_n}\cap{B_n}$ is empty, and the set ${A_n}\cup{B_n}$ contains all odd primes smaller than $4n+4$
  • The set ${C_n}\cap{D_n}$ is empty, and the set ${C_n}\cup{D_n}$ contains all odd primes smaller than $6n+6$

Nevertheless, for almost every value of $n$:

  • ${P(A_n)}\leq{P(B_n)}$
  • ${P(C_n)}\leq{P(D_n)}$

The graphs and table below provide some empirical evidence:

enter image description here enter image description here enter image description here

 range     | odd primes | cases where either P(A)>P(B) or P(C)>P(D)
-----------|------------|-------------------------------------------
 10000     | 1228       | 0
 100000    | 9591       | 1
 1000000   | 78497      | 239
 10000000  | 664578     | 239
 100000000 | 5761454    | 1940

I would expect primes to be equally distributed between $A_n$ and $B_n$ and between $C_n$ and $D_n$.

In other words, I would expect:

  • [Number of primes of the form $4k+1$] $\approx$ [Number of primes of the form $4k+3$]
  • [Number of primes of the form $6k+1$] $\approx$ [Number of primes of the form $6k+5$]

But since the empirical evidence above suggests otherwise, my questions are:

  1. Is this indeed the case, or do they become equally distributed on a larger range?
  2. If this is indeed the case, what research has been conducted attempting to explain it?

Thanks

3

There are 3 best solutions below

4
On BEST ANSWER

The phenomenon you observe is real. This is known, the case for $4$, as Chebyshev's bias Another relevant keyword is Shanks–Rényi race problem.

"Prime number races" by Granville and Martin is a fantastic introduction to this circle of ideas.

But, let me include some basic information here (more-or-less self-plagiarizing an MO-answer).

On a rough scale the frequency counts of primes congruent $1$ and $3$ modulo $4$ are the same; both counting functions are asymptotic to $\frac{1}{2} \text{li}(x)$ with error terms essentially as commonly know from the prime counting function. This is the well-know Prime Number Theorem for arithmetic progressions (mentioned by Dietrich Burde).

However, if one compares the exact counts of primes congruent to $1$ and $3$ modulo $4$ respectively, let me call the respective counting functions $\pi_1(x)$ and $\pi_3(x)$, then one notes (at least at the start) that there are more congruent to $3$ than congruent to $1$, so $\pi_3(x) > \pi_1(x)$, an observation made by Chebyshev. However, Littlewood showed that the difference $\pi_3(x) - \pi_1(x)$ can also be negative, and even is infinitely often essentially as negative as it can get (under the assumption that both should not deviate from $\text{li}(x)/2$ by more than $\sqrt{x}$ and a little).

So, now one might think that one just came across a phenomenon of small numbers, as you said, with this initial bias however this is not so there is a bias in the distribution.

If one defines $P$ to be the set of all integers such that $\pi_3(x) > \pi_1(x)$ then these are not "half of the integers". Rubinstein and Sarnak proved (under widely believe conjectures on zeros of L-functions, GRH and GSH) that the logarithmic density of this set, that is the limit of $$ \frac{1}{\log x} \sum_{n \in P} \frac{1}{n} $$
is $0.9959...$, so quite close to $1$ though not equal to $1$ so "almost all" is perhaps too strong.

2
On

The Dirichlet density of primes $p\equiv 1 \bmod 4$ and of primes $p\equiv 3 \bmod 4$ is both equal to $\frac{1}{2}$, but the number of primes up to $x$ in both classes can differ. This is what is meant by Prime number races ( see the answer of quid).

0
On

I've come across something that I would expect to provide to provide evidence for the fact that more primes can be written as 6n+5 than 6n+1. Hopefully the way that I describe this will make sense, though I am not expecting it to be perfectly legibile. First, its important to note that any number 6n+3 is divisible by three and therefore not prime. Also, 6n+2 and 6n+4 are also not prime because they are even. Therefore the set of primes that can be written as 6n+1 is equivalent to the set of primes that can be written as 3n+1 (bear with me here). In a similar way, the set of primes that can be written as 6n+5 is equivalent to the set of primes that can be written as 3n-1. All numbers that can be written as either 3n+1 or 3n-1 will be prime, a power of a prime, or a composite number. Any number 3n+1 raised to a power will give another number 3m+1. Any number 3n-1 raised to an odd power will give another number 3m-1, but raised to an even power would give a number 3m+1. It would therefore make sense that fewer numbers 3n+1 will be prime, compared to those written 3n-1. This thinking, however, assumes that composite numbers are equally distributed between numbers 3n+1 and 3n-1.