At first glance, it seems like the pumping lemmas should somehow "easily" show that an infinite set of primes (say, written in binary) cannot be a regular language or context-free grammar. But I don't see how to prove it because the part of the binary expansion being pumped can be in the middle of the string. Is it known whether an infinite set of primes can be a regular language or CFG, and how to show it?
Can an infinite set of primes be a regular language or CFG?
2.2k Views Asked by Bumbble Comm https://math.techqa.club/user/bumbble-comm/detail AtThere are 2 best solutions below
On
First, note that the answer depends in the representation you use! For example, you could represent numbers over the alphabet $\{a,b\}$ by writing $$ "\underbrace{a\ldots a}_{e_1\text{ times}}b\underbrace{a\ldots a}_{e_2\text{ times}}b\ldots\underbrace{a\ldots a}_{e_n\text{ times}}" \text{ for the number } \sum_{k=1}^n p_k^{e_k} \text{ where $p_k$ is the $k$-th prime.} $$ Then, a number is prime exactly if the representation contains a single $a$.
For a unary represetation of numbers, you can easily show using the pumping lemma that if there's an infinite regular set $A$ of primes, then there are $a,b$ such that $n = a + tb$ is prime for every $0 < t \in \mathbb{N}$. But that's impossible, because for $t=a$, you get $n = a + a\cdot b = a(1 + b)$ which clearly isn't prime (You also have to check the case $a=0$, but then $tb$ clearly isn't prime as well).
For a binar representation of numbers, things get more messy. Let $b_0\ldots b_n$, $b_i \in \{0,1\}$ be the binary represention of $b$, i.e. $b = \sum_{k=0}^n b_i2^i$. If there's a regular infinite set of primes $A$, it follows from the pumping lemma that we can find $b_0\ldots b_k\ldots b_m\ldots b_n$ such that all the numbers represented by $$ b_0\ldots b_{k-1}\underbrace{(b_k \ldots b_{m-1})(b_k\ldots b_{m-1})\ldots }_{\text{$t$ times}}b_m\ldots b_n $$ are prime. Put different, there's a prime $p$ such that $$ (\star)\quad p_1 + 2^kp_2\sum_{i=1}^t 2^{(m-k)i} + 2^mp_3 \text{ is prime for all $0 < t \in \mathbb{N}$,} $$ where $$ p_1 = p\text{ mod } 2^k,\, p_2 = 2^{-k}(p - p_1 \text{ mod }2^m),\, p_3 = 2^{-m}(p - p_1 - 2^kp_2) \text{.} $$ So now you have to derive some contradiction from $(\star)$. I don't immediately see a short way to do that, though.
Assume $P$ is your infinite set of primes, and we are interested in the language writing this in a base $B$. We prove the language isn't context free by contradicting the respective pumping lemma. We will use variables to indicate numerical values and the strings in base $B$ representing them interchangeably for conciseness. In particular, $\lvert a \rvert$ denotes the length of the base $B$ representation of $a$.
Assume the language is context free, so it satisfies the respective pumping lemma. Let $N$ be the constant of the lemma, $p \in P$ such that $\lvert p \rvert \ge 2 N$. $P$ being infinite such $p$ certainly exists.
By the pumping lemma, we can write (as strings): $$ p = u v w x y $$ with $\lvert v \rvert + \lvert w \rvert + \lvert x \rvert \le N$, $\lvert v \rvert + \lvert x \rvert \ne 0$ such that for all $k \in \mathbb{N}_0$ we have (as strings) $u v^k w x^k y$ is in the language, in particular, it is a prime.
Translate into numbers now (and forget the strings). Calling $\alpha = \lvert v \rvert$ and $\beta = \lvert x \rvert$ the string pumped $k \ge 1$ times represents: $$ p^{(k)} = u \cdot B^{ (\alpha + \beta) k + a} + v \cdot B^{\beta k + b} \cdot \frac{B^{\alpha k} - 1}{B^\alpha - 1} + w \cdot B^{\alpha k + c} + x \cdot B^d \cdot \frac{B^{\beta k} - 1}{B^\beta - 1} + y $$ for some $a$, $b$, $c$, and $d$. If $v$ or $x$ are empty, the respective term is missing from on now. By Fermat's little theorem, $A^p \equiv A \pmod{p}$ for all $A$. Also $B^\alpha$ and $B^\beta$ are less than $p$, and so relatively prime to it. Now: $$ p^{(p)} \equiv u \cdot B^{ \alpha + \beta + a} + v \cdot B^{\beta + b} \cdot \frac{B^\alpha - 1}{B^\alpha - 1} + w \cdot B^{\alpha + c} + x \cdot B^d \cdot \frac{B^\beta - 1}{B^\beta - 1} + y \equiv p \equiv 0 \pmod{p} $$ Contradiction.