Size of automata or regular expressions avoiding cross patterns

95 Views Asked by At

Let $\Sigma$ be an alphabet of finite size $k$, and $n$ some integer. I am interested in the language of words of size $n$ that do not contain $abab$ as a subword, for any pair $(a,b) \in \Sigma$ (I insist that I care about subwords and not factors). This language can be described more formally as

$$L=\Sigma^n \setminus (\bigcup_{a \neq b \in \Sigma^2} \Sigma^*a \Sigma^* b \Sigma^* a \Sigma^* b \Sigma^*)$$

Now, obviously this is a finite language, so it is easy to build such an automaton, but its size could a priori be exponential in $n$ while I would like a polynomial dependency.

Questions: Is there a polynomial-sized deterministic automaton recognizing $L$? Is there a polynomial-sized nondeterministic one? Is there a polynomial-sized regular expression for this language?

Note that taking the usual product automaton construction yields something that grows like $c^{k(k-1)/2}$, which is too much if $k=\Theta(n)$ for example, which is the regime I am interested in.

If the answers to the previous questions are negative, I am also interested in the same question for context free grammars and pushdown automata.

Edit: For every subset $S$ of $\Sigma$, the languages accepted after having read the ordered word corresponding to $S$ are different, so that gives an exponential lower bound for deterministic regular automata. I am still curious about the other cases though.