Trying to convert DFA to regular expression

700 Views Asked by At

I'm trying to write a regular expression from this DFA but I'm having some trouble.

enter image description here

I can tell you what I've done so far: I started by adding a new beginning state and a new final state because initial states cannot have incoming edges and made a new final state because they can't have out going edges. From there I started eliminating states but I can't come up with an expression that works so I'm going to start over. Can someone help me come up with a regular expression that represents this DFA. Thanks.

3

There are 3 best solutions below

0
On

You begin with node $1$, and either stop, or you then go to node $2$ one or more times, then you go to node $3$ once and either end there, return to the previous step, or return to the first step.

So it's either just $1$, or else it is one or more of: ($1$, followed by one or more of: (one or more of $2$, followed by one $3$)) that is maybe followed by a final $1$.

$$[1\mid(1(2{+}3){+}){+}1?]$$

[edit: forgot the double circles indicate possible end states.]

0
On

You can either go from state $1 \to 2 \to 3 \to 1$ zero or more times or you can go from state $1 \to 2 \to 3$.

For the first case you have $(a|b)a^*b(bb)^*a$ for one loop which becomes $((a|b)a^*b(bb)^*a)^*$ to go around zero or more times. That also covers the case where you stay in the initial state.

For the second case you have $(a|b)a^*b(bb)^*$

If you "or" the two cases together you get your regular expression.

0
On

DFA to RE conversion can be done systematically, using what is essentially linear algebra. For each $i$, let $L_i$ be the language of words starting in state $s_i$ and ending in an accept state. The DFA gives us the following system of "linear equations" in the $L_i$'s: $$\begin{eqnarray} L_0 &=& b L_1 + a L_2\\ L_1 &=& a L_1 + b L_2 + \epsilon\\ L_2 &=& (a+b) L_2 + \epsilon\\ \end{eqnarray}$$ where $\epsilon$ denotes the language consisting of just the empty word, and $+$ is union, which we'll eventually write as "|". (For example, a word in $L_1$ either starts with $a$ (in which case what's left over is in $L_1$), with $b$ (in which case what's left over is in $L_2$) or is empty (since $s_1$ is an accept state).)

These equations can be solved by successively eliminating all $L_i$'s other than $L_0$, which is the one we want since $s_0$ is the initial state. The key fact which makes this easy is the following: if $A$ and $B$ are languages, then the unique language $X$ satisfying $X=AX+B$ is $X=A^*B$.

We start by eliminating $L_2$ using the last equation; the key fact tells us $$L_2=(a+b)^*\epsilon = (a+b)^*.$$ Thus the middle equation becomes $L_1=a L_1+ b(a+b)^* + \epsilon$, whence $$L_1=a^*( b(a+b)^*+\epsilon),$$ and finally the first equation gives $$L_0=b a^*( b(a+b)^*+\epsilon) + a (a+b)^*.$$ In more standard regular expression notation, this is $$b a^*(b[ab]^*)?\,|\,a[ab]^*$$