Show that, if $L$ is a regular language, then so is $\{w : \exists n \in \Bbb{N}, w^n \in L\}$

53 Views Asked by At

Suppose $L$ is a regular language over an alphabet $\Sigma$. Let $$L' = \{w : \exists n \in \Bbb{N}, w^n \in L\}.$$ Prove that $L'$ is regular too.

2

There are 2 best solutions below

0
On

We plan to construct an NFA with $\varepsilon$-moves that accepts $L'$, to show it is regular.

For all $n \in \Bbb{N}$, let $$L_n = \{w : w^n \in L\}.$$ We will show two things:

  1. $L' = \bigcup_{n=1}^N L_n$ for some $N \in \Bbb{N}$,
  2. Each $L_n$ is regular.

As is commonly known, the union of finitely many regular languages is regular. Thus, these two claims together prove that $L'$ is regular.

Let $(Q, \Sigma, \delta, q_0, F)$ be a NFA-$\varepsilon$ that accepts $L$, in the sense of the wikipedia article: $Q$ is the set of states, $\Sigma$ remains the alphabet, $\delta : (Q \times \Sigma \cup \{\varepsilon\}) \to \mathcal{P}(Q)$ is the transition function, $q_0 \in Q$ is the initial state, and $F \subseteq Q$ is the set of final states.

First claim

To prove the first claim, let $N = |Q|$, and suppose integer $n$ and string $w \in \Sigma^*$ exist such that $w^n \in L$. Without loss of generality, suppose $n$ is the least positive integer such that $w^n \in L$. We wish to show $n \le N$ by contradiction, hence we assume $n > N$.

Consider the path followed along our FSA-$\varepsilon$ in order to accept $w^n$. The path begins at $q_0$, of course, and after the first $|w|$ steps, it will end on a state $q_1 \in Q$, having spelled out the first copy of $w$. It then follows another $|w|$ steps, spelling out $w$ again, ending on some $q_2$. Continue this, until the final $w$ is spelled out, leaving us on some $q_n \in F$, as the FSA-$\varepsilon$ must accept $w^n$.

The finite sequence $q_1, \ldots, q_n$ consists of $n > |Q|$ states, all contained in $Q$. By the pigeonhole principle, this sequence must contain a pair of duplicates, that is, there exist some $i$ and $j$ such that $1 \le i < j \le n$ such that $q_i = q_j$.

If we cut out the loop between $q_i$ and $q_j$, we get a path through our NFA-$\varepsilon$ that accepts $w^{n - (j - i)}$, hence $w^{n - (j - i)} \in L$, where $n - (j - i) < n$. This contradicts the minimality assumed of $n$, hence we have a contradiction. Thus, $n \le N$, and so our first claim holds true.

Second claim

Now, fix $n \ge 1$. We wish to construct an NFA-$\varepsilon$ that accepts $L_n$. Let us define our new set of states: $$Q_n = (Q^n \times Q^{n-1}) \cup \{q_0^n\}.$$ Our new initial state shall be $q_0^n$. From $Q_n$, we define our final states: $$F_n = \{((q_1, q_2, \ldots, q_n), (q_1, q_2, \ldots, q_{n-1})) \in Q^n \times Q^{n-1} : q_n \in F\}.$$ We now proceed to define our state transition function $\delta_n : Q_n \times (\Sigma \cup \{\varepsilon\}) \to \mathcal{P}(Q_n)$. In our first case, define $$\delta_n(q_0^n, \varepsilon) = \{((q_0, q_1, q_2, \ldots, q_{n-1}), (q_1, q_2, \ldots, q_{n-1})) \in Q^n \times Q^{n-1}\},$$ remembering that $q_0$ is the initial state of our NFA-$\varepsilon$ accepting $L$. Next, $\delta_n$ will map $$\delta_n((q_1, q_2, \ldots, q_n), (r_2, r_3, \ldots, r_n), x) \ni ((q_1', q_2', \ldots, q_n'), (r_2', r_3', \ldots, r_n'))$$ if and only if $\delta(q_i, x) \ni q_i'$ for all $i$, and $(r_2, \ldots, r_n) = (r_2' \ldots, r_n')$. Map every other element of $Q^n \times Q^{n-1} \times (\Sigma \cup \{\varepsilon\})$ to the empty set.

I won't prove it rigorously, but the FSA-$\varepsilon$ $(Q_n, \Sigma, \delta_n, q_0^n, F_n)$ accepts $L_n$. Given any $w$ such that $w^n \in L$, as we did in the first claim, we can find a path that accepts $w^n$, and map out $q_1, q_2, \ldots, q_n$ as the ending states of the copies of $w$. To accept such a $w \in L^n$, we must simultaneously verify the $w$ from $q_0$ to $q_1$, as well as the $w$ from $q_1$ to $q_2$, etc.

We start our path in the new FSA-$\varepsilon$ at $q_0^n$, a dummy state, but then move, via $\varepsilon$-transition, to a state of the form $$((q_0, q_1, q_2, \ldots, q_{n-1}), (q_1, q_2, \ldots, q_{n-1})),$$ where the $Q^{n-1}$ part keeps track of the ending states of the first $n-1$ copies of $w$, while the $Q^n$ part keeps track of which state we are in simultaneously along the path when tracing out each copy of $w$. We transition to other states with the same $Q^{n-1}$ part (which exists only to verify the shreds of path join up at the end), but with each entry of the $Q^n$ part changing with the same transition.

Essentially, we run the original FSA-$\varepsilon$ $n$ times simultaneously (in the $Q^n$ bit), verifying each $w$ in $w^n$ simultaneously, leaving the $Q^{n-1}$ bit to make sure our threads join up at the end.

0
On

This question is much easier to solve if you use monoids instead of automata.

Proposition. The language $L'$ is recognized by the syntactic monoid${}^*$ of $L$.

It follows immediately that $L'$ is regular, since a language is regular if and only if its syntactic monoid is finite, but the proposition is much more precise. For instance, it shows that if $L$ is star-free, then so is $L'$, a result that would be challenging to prove with automata. The proposition is a corollary of a more general result on transductions, first given in [2] (see also [1] for further references). Here is a self-contained proof.

Let $M$ be the syntactic monoid of $L$, let $f:A^* \to M$ be its syntactic morphism and let $P = f(L)$. Then $$ (*) \quad L = f^{-1}(P) = f^{-1}(f(L)) $$ For each $m \in M$, let $m^*$ be the submonoid of $M$ generated by $m$. Observe that, if $m = f(u)$, then $m^* = f(u)^* = f(u^*)$, which justifies the notation $m^*$. Let $Q = \{m \in M \mid m^* \cap P \not= \emptyset\}$. I claim that $L' = f^{-1}(Q)$, which suffices to conclude. Indeed \begin{align} f^{-1}(Q) &= \{u \in A^* \mid f(u) \in Q \} = \{u \in A^* \mid f(u^*) \cap P \not= \emptyset \} \\ &\color{red} = \{u \in A^* \mid u^* \cap f^{-1}(P) \not= \emptyset \} = \{u \in A^* \mid u^* \cap L \not= \emptyset \} = L' \end{align} The only tricky part in this series of equalities is the red one. Suppose that $f(u^*) \cap P \not= \emptyset$. Then there exists $n \geqslant 0$ such that $f(u^n) \in P$. Thus $u^n \in f^{-1}(P)$, whence $u^* \cap f^{-1}(P) \not= \emptyset$. If now $u^* \cap f^{-1}(P) \not= \emptyset$, there exists $n \geqslant 0$ such that $u^n \in f^{-1}(P)$. It follows that $f(u^n) \in f(f^{-1}(P))$ and by $(*)$, $f(u^n) \in P$. Thus $f(u^*) \cap P \not= \emptyset$.

Note that, although the syntactic monoid of $L$ recognizes $L'$, its minimal automaton does not accept $L'$, even by changing the final states.

[1] J.-É. Pin, How to prove that a language is regular or star-free?, Proc. LATA 2020, LNCS 12038 (2020) 68-88

[2] J.-É. Pin and J. Sakarovitch, Some operations and transductions that preserve rationality, Proc. 6th GI Conference, Berlin, (1983), 277-288, LNCS 145, Springer.

${}^*\scriptsize\text{ Sorry to link to the French entry of Wikipedia, but it is preferable to the English one.}$