Set of all factors of regular language regular?

500 Views Asked by At

If $L$ is a regular language (i.e. acceptable by a finite automata), is the the set of its factors (infixes) also regular?

2

There are 2 best solutions below

3
On BEST ANSWER

Short answer. This is a consequence of the fact that regular languages are closed under left and right quotients by any language.

Given languages $L$, $X$ and $Y$ of $A^*$, let $$ X^{-1}LY^{-1} = \{u \in A^* \mid \text{there exist $x \in X$ and $y \in Y$ such that $xuy \in L$}\} $$ I let you verify that $(X^{-1}L)Y^{-1} = X^{-1}(LY^{-1}) = X^{-1}LY^{-1}$ so there is no ambiguity with this notation.

Theorem. If $L$ is regular, then $X^{-1}LY^{-1}$ is regular for any $X$ and $Y$ (regular or not).

In particular, the set of all factors of $L$, which is equal to $(A^*)^{-1}L(A^*)^{-1}$, is regular.

Proof of the theorem. Observe that $$ X^{-1}LY^{-1} = \bigcup_{x\in X, y\in Y} x^{-1}Ly^{-1} = \bigcup_{x\in X, y\in Y} (x^{-1}L)y^{-1} $$ Now if $L$ is regular, Myhill Nerode theorem tells you that $L$ has finitely many quotients of the form $x^{-1}L$ and all these quotients are regular. Dually, they are only finitely many quotients of the form $Ly^{-1}$ and finally only finitely many quotients of the form $x^{-1}Ly^{-1}$, all regular again. Since regular languages are closed under finite union, $X^{-1}LY^{-1}$ is regular.

0
On

We can proceed by induction on the regularity of the language. There are $5$ cases to consider:

  • The empty language. Well, in this case, the infixes would also constitute the empty language, which is regular.
  • A singleton language. The infixes would constitute just the singleton language again, so we're done here. If the empty string is allowed, then we're still OK because all finite languages are regular.
  • For union $A \cup B$, we notice that any term in the language is in either in $A$ or $B$ and we proceed by the inductive hypothesis.
  • For concatenation $A . B$ we notice that an infix is either an infix of $A$, an infix of $B$ or a suffix of $A$ and a prefix of $B$. The first two proceed by inductive hypothesis. Since regularity closure is known for prefixes and suffixes, the other case works.
  • Finally, for Kleene-star $A^*$, there are $2$ cases to consider. The first is that every infix term is a suffix of $A$, followed by $A^*$, followed by a prefix of $A$. From the known prefix/suffix closures of regular languages, and concatenation, we can combine these into a regular language. For the other case, we have that the infix is an infix of $A$ itself. In this case, we get our result by induction. Then we just use union to put the two cases together into a regular language*.

*This case may need more careful formal treatment than I have given it.