Regular Expression for all string over {a,b} without abb

1.1k Views Asked by At

I am trying to come up with a regular expression that excludes the string 'abb' ONLY. All other combinations are possible.

So far I have: b*+a*(aaba)+a(a*ba)a+a(ba)*(a+ab)a+a(ab)(a+ab).

Which is a bit long.

1

There are 1 best solutions below

1
On

This is a DFA accepting the strings over $\Sigma=\{A,B\}$ not having $ABB$ as a substring:

____ / A _______A____ \ // \ Start--A-> [Prefix A] --B-> [Prefix AB] --B-> Discard \ ^ B A ________ \ | / \ --> [Prefix B*] --B->/

After reading the first character we either stay in the bottom line forever, or we reach the states in the first line, accepting cycles like $A^+$ and $(BA)^+$, for finishing (at $\text{[Prefix A]}$) with an $A$ or (at $\text{[Prefix AB]}$) with a $B$. It follows that

$$B^*(A\{\varepsilon,B\})^*$$ (thanks to Fabio Somenzi for spotting an inaccuracy in the previous version of this answer) is a more compact representation of the strings accepted by the DFA above.


The transition matrix of the DFA can be represented as $$ \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 0\end{pmatrix}$$ having eigenvalues $\frac{1\pm\sqrt{5}}{2},1$. $L_n$, the number of strings with length $n$ fulfilling the wanted constraint, is so $\lambda_0 + \lambda_1 \varphi^n+\lambda_n \overline{\varphi}^n$ with $\varphi$ being the golden ratio. Since $L_1=2, L_2=4, L_3=7$ we simply have $L_n = \color{red}{F_{n+3}-1}$ with $F_n$ being the $n$-th Fibonacci number.