I was given an example in class. It goes as follows:
Let the alphabet be $\{a, b\}$. $L$ is the string with one $b$ and at least two $a$'s Find a regular expression that generates this language.
This was my professor's answer: $r = aaa^*ba^*+abaa^*+baaa^*$
My question with this solution is:
Wouldn't this regular expression FORCE $aa$, $aba$, and $baa$ to be together, respectively? Like, there doesn't seem to be a constraint that $a$ must follow $a$ in the first term, or that $a$ must precede $b$ which must precede $a$ in the second, etc.
That's why this was my answer: $r = a^*aa^*aa^*ba^* + a^*aa^*ba^*aa^* + a^*ba^*aa^*aa^*$
It still forces there to be one $a$ and at least two $a$'s, but it doesn't make them be right next to each other.
Which solution is correct? I hope I explained myself correctly
Both answers are correct.
First note that the following regular expression works:
$$a^*aaba^*+a^*abaa^*+a^*baaa^*\;.$$
The first term catches every word of $L$ that has a pair of $a$s anywhere after the $b$, the second catches every word that has the $b$ between two $a$s, and the third catches every word that has two $a$s anywhere after the $b$. Of course, many words are caught by more than one term, but that’s perfectly all right. It doesn’t matter that I specify that the two $a$s are together: in any string of two or more $a$s, there are two that are adjacent, and I can always choose them to be the first two or the last two.
Your answer is simply an inefficient version of this one. Your $a^*aa^*aa^*ba^*$ is exactly equivalent to $a^*aaba^*$, both generating every string that has at least two $a$s before the $b$. Similarly, your $a^*aa^*ba^*aa^*$ is equivalent to $a^*abaa^*$, both generating every string that has at least one $a$ on each side of the $b$. And your third term is similarly equivalent to mine.
This can be simplified if we observe that either the $b$ comes first, in which case we must have $baaa^*$, or it comes last, in which case we must have $a^*aab$, or it comes between two $a$s, in which case we must have $a^*abaa^*$. Thus, the expression
$$baaa^*+a^*abaa^*+a^*aab$$
will work just fine.
Your professor simplified it a little differently. First, $aaa^*ba^*$ catches every word of $L$ that has at least two $a$s before the $b$. If a word of $L$ does not have two $a$s before the $b$, then either it starts $aba$, in which case it’s caught by $abaa^*$, or it starts $baa$, in which case it’s caught by $baaa^*$.