Proving that a language is not context-free

315 Views Asked by At

Given the language $$L = \{ a^p \mid p\, \text{IS NOT prime} \}$$ is $L$ Context free? If not, prove that it's not.

May I have some suggestions on how to use the pumping lemma to prove this, please?

Thanks.

3

There are 3 best solutions below

0
On

I don't know if you were asked to use necessarily the pumping lemma in order to solve the problem, but in this case of language, it is much easier to use Parikh's theorem.

The alphabet of the language is $\Sigma=\{a\}$. So, for every word $w$, Parikh's vector is $P(w)=|w|_{a}$ (which is the number of occurences of $a$ in $w$). Then, the set of Parikh's vectors for language $L$ of the exercise is the set of divisible numbers. $$S=\{P(w):w\in L\}=\{|a^p|: p~\text{not prime}\}=\{p:p~\text{not prime}\}$$

It is easy to show that this set is not semi-linear, since no finite union of linear sets can result to $S$. If there was such a union, a divisible number would be missing, leading to contradiction.

0
On

The pumping lemma can be used to solve this problem.

We need to show: for all claimed pumping lengths $p\in\mathbb{N}$, there is some string $s$ of length $\ge p$ that cannot be split into the shape $s=uvwxy, |vx|\ge 1, |vwx|\le p$ such that $\forall n\in\mathbb{N}: uv^nwx^ny\in L$.

Fix a "pumping length" $p$. Let $q$ be the least composite number that has no prime factor $\le p$ (it's worth saying that such a composite number exists: you could take $r$ to be the least prime $> p$ and then $r^2$ has no prime factor $\le p$, so $q\le r^2$ exists).

Take the string $s=a^q$. No matter what split $s=uvwxy$ occurs, observe that $uv^nwx^ny=a^{q+(n-1)|vx|}$ i.e. repeating $v,x$ will just correspond to adding $|vx|$-many characters $a$. Let $m:=|vx|$. As $a^q\in L$, we need to show that some string $a^{q+(n-1)m}$ is not in $L$.

So, our proof is complete if we can show that, in general, a sequence $q,q+m,q+2m,q+3m,\dots$ (for $q$ prime, $1\le p<q$) must contain a prime number in it.

Dirichlet's theorem gives that every arithmetic sequence contains infinitely many primes if the first term and common difference are coprime. We chose $q$ to have no prime factors $\le p$, and $m\le |vwx|\le p$, so $q,m$ are coprime. Hence, $L$ is not context-free.


Just for interest, that $\bar{L}$ is also not context-free would require a similar statement, that a (slightly constrained) arithmetic sequence contains at least one non-prime, and in fact this and a much stronger statement is also true, due to a different theorem sometimes also called "Dirichlet's theorem".

0
On

Here is another proof:

  • $\{a^p\mid p\in \mathbb{P}\}$ is not a context-free language. This can be proven using the pumping lemma, but the proof is easier than for $\{a^n\mid n\notin\mathbb{P}\}$. Basically, if $u = a^p = vwxyz$ is a decomposition, then $vw^{p+1}xy^{p+1}z = a^{p+p|wy|}$ and $p(1 +|wy|)$ is not prime.
  • Context-free languages over unary alphabet are regular (this is another application of Parikh's theorem, but see here for another proof).
  • Regular languages are closed under complement.

Therefore, if $L = \{a^n\mid n\notin \mathbb{P}\}$ is context-free, then it is regular, and so is its complement $\{a^p\mid p\in \mathbb{P}\}$, but this is a contradiction.