Proving Language is Non Regular Using Pumping Lemma

80 Views Asked by At

I am working on a question where I have the formal language Z over the alphabet Q {a, b, c} and it is generated by the context-free grammar whose non-terminals are S, A, and B, the start symbol is S, production rules are as follows:

(1) S → abSb
(2) S → A
(3) A → Bc
(4) B → cA
(5) A → a

The question asked to describe the structure of strings in the language and I came up with $Z = \{w ∈ Q^* | w = ab^n o u o b^m, u ∈ Q^*, n, m ∈ N^*\}$. ($o$ means concatenated here).

I am now struggling with the second part of the question where I am asked to show this language isn't regular using the pumping lemma. I understand the concept of the pumping lemma and I have applied to simpler questions but I have never used it for a language involving concatenation.

Most examples I have seen online have been of structure $L = \{0^n 1 0^n 1 | n ∈ N^*\}$. I am wondering if there is some rule I am missing that says the powers must be the same or if I have gone wrong in my definition of the language.

Thanks in advance :)

2

There are 2 best solutions below

3
On

Your suggested expression, $$Z = \{w ∈ Q^* \mid w = ab^n Q^* b^m\},$$

is wrong. As you suspected, the language is irregular only if there is something forcing the number of $\mathtt b$s to be the same. In this case, there is. If the language were actually $\mathtt{ab}^nQ^*\mathtt{b}^m$, it would be regular.

Hint: Note that the only way to eliminate $B$ from a term is to replace it with ${\mathtt c}A$. Let's produce an equivalent grammar that doesn't use $B$. To do this, we drop the rule $B \to {\mathtt c}A$ completely, and then instead of $A\to B{\mathtt c}$, have $A\to {\mathtt c}Ac$. (Do you understand why this works?)

Now the grammar is $$\begin{align} S&\to \mathtt{ab}S\mathtt{b}\\ S&\to A \\ A&\to \mathtt{c}A\mathtt{c}\\ A&\to \mathtt{a} \end{align} $$

Perhaps this will be easier to handle?

6
On

If you ignore rule 1) then you get a subset $L_c$ of the language consisting of strings $s_n = c^nac^n$.

For large enough $n$'s the pumping lemma should be applicable to any $s_n$, so some non-empty substring $y$ of $s_n$ can be replicated arbitrary number of times.

This substring cannot contain $a$ because having more than one $a$ is only possible by introducing $b$'s from rule 1), but the replication result cannot contain $b$'s.

This means that substring $y$ would be inside of one of $c^n$ substrings. But replicating such $y$ would disbalance $c$'s, i.e., the resulting substring would look $c^mac^n$ with $m \neq n$.

P.S. About the language and its strings: We can use (1) some number of times ($n$), but once we use (2) we can't never get back. This means that at some stage we have intermediate result $(ab)^nAb^n$. After that we can alternate between (3) and (4) some number ($m$) of times to get $(ab)^nc^mAc^mb^n$, but to finish we will eventually use (5) and end up with $(ab)^nc^mac^mb^n$, which is the format of all strings in $Z$.

P.S.2 There is an intuitive explanation why $Z$ is not regular. All regular languages can be parsed by finite automata. But after $(ab)^n$ is parsed the automaton must remember $n$ somehow so that it can check the number of $b$'s at the end. Since $n$ can be arbitrarily large one might need unlimited memory to remember it, which a finite automaton does not have.