Let $L$ be the language $\{w \in \{a, b\}^{*} : bba$ is not a substring of $w\}$. We claim that $L$ is a regular language, and, for such, we will prove that the following regular expression $R = (a \cup ba)^{\ast}b^{\ast}$ describes $L$.
Proof. We will prove that $L(R) = L$.
Proof of $L(R) \subseteq L$.
Suppose that $w \in L(R)$, we will prove that $w \in L$ by induction on $|w|$. If $|w| = 0$ or $|w| = 1$ or $|w| = 2$, then, $w \in \{\epsilon\} \cup \{a, b\} \cup \{aa, ab, ba, bb\} \subseteq L$, therefore, since $bba$ cannot be a substring of $w$, it follows that $w \in L$.
If $|w| = 3$, then, we have the following possibilities: $w = aaa$, $w = aba$, $w = bab$, $w = bbb$, $w = baa$, $w = aab$, or $w = abb$. Since $bba$ does not occur in any of these possibilities, we have that $w\in L$.
If $|w| \geq 4$, then, we have the following possibilities: $w = aaaxy$, $w = abaxy$, $w = babxy$, $w = bbbxy$, $w = baaxy$, $w = aabxy$, or $w = abbxy$, where $x \in L((a \cup ba)^{\ast})$ e $y \in L(b^{\ast})$. Thus, we have the follwing cases:
Case $w = bbbxy$ or $w = abbxy$: Note that, the sufix of both possibilities is $bbxy$, thus, we have that $|bbxy| < |w|$, hence by the inductive step follows that $bbxy \in L$. Then, whenever $w = bbbxy$ or $w = abbxy$, it follows that $x = \epsilon$ to not have $bba$ as substring. Thus, $bba$ cannot be a substring of $w = bbbxy$ and cannot be a substring of $w = abbxy$, therefore, we can conclude that $bbbxy \in L$ e $abbxy \in L$.
Case $w = aabxy$ ou $w = babxy$: In a similar way to the previous case, we have that the suffix $abxy \in L$ by the inductive hypothesis, since $|abxy| < |w|$. Thus, the string $x$ cannot begin with $ba$, so none of the possibilities considered here will have $bba$ as substring, hence, we can conclude that $aabxy \in L$ e $babxy \in L$.
Case $w = aaaxy$ ou $w = abaxy$ ou $w = baaxy$: For each of these cases, we have $w \in L$ if and only if $xy \in L$, but since $xy \in L((a \cup ba)^{\ast}b^{\ast})$ and $|xy| < |w|$, we have $xy \in L$ by induction hypothesis. Thus, $w$ does not have substring $bba$, therefore, $w \in L$.
Since we exhausted all possibilities, given that we take an arbitrary $w$ of $L(R)$, then we can conclude that $L(R) \subseteq L$.
- Proof of $L \subseteq L(R)$.
Suppose that $w \in L$, we will prove that $w \in L(R)$ by induction on $|w|$. If $|w| = 0$, or $|w| = 1$, or $|w| = 2$, by the definition of $L$ we have that $w \in \{a, b\}^{\ast}$, then, it follows that $w \in \{\epsilon\} \cup \{a, b\} \cup \{aa, bb, ab, ba\} \subseteq L(R)$. Hence, for any of these possibilities, we can conclude that $w \in L(R)$.
If $|w| \geq 3$, since $w \in \{a, b\}^{\ast}$, then we have the following possibilities: $w = aax$ or $w = bbx$ or $w = abx$ or $w = bax$, with $x \in L$. Since $w$ does not have substring $bba$, note that, when $w = bbx$, $w \in L$ if and only if $x$ ndoes not begin with $ba$ or $x$ does not begin with $a$. Similarly, when $w = ab$, $w \in L$ if and only if, $x$ does not begin with $ba$. Since, $|x| < |w|$, by the inductive hypothesis we have $x \in L(R)$ and therefore $x = uv$ with $u \in L((a \cup ba)^{\ast})$ and $v \in L(b^{\ast})$. Since $aau$, $ bbu$, $abu$ and $bau$ are in $L(R)$, we see that $w \in L(R)$, as we wanted to show.
Since we explored all possibilities, given that we take an arbitrary $w$ of $L$, then we can conclude that $L \subseteq L(R)$.
Therefore, since we showed that $L(R) \subseteq L$ e $L \subseteq L(R)$, we can conclude that $L(R) = L$. $\square$
Is my proof okay? I am always getting confused in the inductive hypothesis, if I always using the correct number of symbols when I assume that $|w| \geq k$ for some $k \in \mathbb{N}$.
The first half of the proof is okay, but there’s a problem with the second half: in the last sentence of the $|w|\ge 3$ paragraph it’s not true that $bbu$ and $abu$ are in $L(R)$ unless $u=\epsilon$.
In this kind of argument I find it easier to to use the least counterexample style of induction argument, like this: