Regex for strings with no three identical consecutive characters

19.1k Views Asked by At

I wanna ask what the regular expression for the strings having the property in the title should be. For binary string with no three consecutive 0, it's quite a simple regex (1|01|001)*|(0|00|$\epsilon$) but I found its very difficult for a normal alphabet string. Do you have any suggestion/solution for that?

Note: Those strings don't contain any three identical consecutive characters. For example, they do not have $aaa$, $bbb$, $ccc$, etc. in their content.

Thanks,

2

There are 2 best solutions below

1
On BEST ANSWER

It isn't easy to product a complement of regular expression. Very long and exhausting calculations :(

For alphabet $\{a,b\}$ (and if I didn't miscalculate) it is something like

((a|ba|bba)(aba|abba|ba|bba)*(abb|ab|bb|a|b|$\varepsilon$))|bb|b|$\varepsilon$

which is ... bulky.

You need to convert your regex to DFA, then turn all non-terminal state into terminal states and vice versa, then convert the resulting DFA to regex back.

And here there is a well-done example of it.

1
On

This is very difficult to have a small regexp using only sum operator | and star *, but in javascript you can do something like :

((.)\2?(?!\2))*
  1. (.) is the second group and match any letter
  2. \2? is an optional repetition of the previous letter
  3. (?!\2) is a condition : the same letter does not follow this expression

It's much easier because you can capture group and force them not to be followed by some expression.