which of these languages are regular sets?

49 Views Asked by At

$$ L_1 = \{a^p b^q\ |\ p+q \ge 10^6\} \\ L_2 = \{a^m b^n\ |\ m-n \ge 10^6\} $$

According to me both of these languages require comparison between number of $a$'s and $b$'s so both of them should be non-regular, please correct me if I am wrong.

2

There are 2 best solutions below

2
On

You're correct about $L_2$, but $L_1$ is regular. Consider:

$$L_3 = \{a^p b^q\ |\ p+q < 10^6\}$$

This language is finite, and therefore regular. But the difference of two regular languages is regular, and $L_3 = a^*b^* - L_1$.

"Comparison" is the right intuition, but it's a little vague. If you want to be totally precise, there's the pumping lemma. The intuition that I use is that a language is not regular if a DFA for it needs to "remember" an unbounded amount of information.

In this way of thinking, we can see $L_2$ is not regular by considering what happens if $m$ and $n$ are both incredibly large, but still within $10^6$ of each other. The DFA has to read all of the $a$'s first, so in order to tell if there were enough $b$'s it would have to "remember" how many $a$'s there were.

By rearranging the inequality to $m \ge n + 10^6$, you can think of $L_2$ as essentially the same (off by a constant) as a fairly famous non-regular language:

$$\{a^mb^n \mid m \ge n\}$$

2
On

$L_1$ is a regular language.

Consider consider $p + q = 10^6$ for simplicity. You can easily construct a nondeterministic finite automaton that accepts one of the following:

  • 0 $a$ and $10^6$ $b$; or
  • 1 $a$ and $10^6 - 1$ $b$; or
  • ...
  • $10^6 - 1$ $a$ and 1 $b$; or
  • $10^6$ $a$ and 0 $b$

The case $p + q \ge 10^6$ is not harder to handle: if you encounter more than $10^6$ $a$, you are done; if you encounter the required number of $b$ plus some more $b$, you are done too.

There are many states, but still they are finitely many.


For $L_2$, rewrite it as follows:

$$L_2 = \{a^n b^m\ |\ n \ge m + 10^6\}$$

This language is very similar to $\{a^n b^n\ |\ n \ge 0\}$ (the difference is that $L_2$ needs to be followed by a few extra $b$), which is notoriously non-regular.