Proving regular expression with induction - $(a \cup ba)^{\ast}b^{\ast}$

109 Views Asked by At

Let $L$ be the language $\{w \in \{a, b\}^{*} : bba$ is not a substring of $w\}$. We claim that $L$ is a regular language, and, for such, we will prove that the following regular expression $R = (a \cup ba)^{\ast}b^{\ast}$ describes $L$.

  • Proof. We will prove that $L(R) = L$.

  • Proof of $L(R) \subseteq L$.

Suppose that $w \in L(R)$, we will prove that $w \in L$ by induction on $|w|$. If $|w| = 0$ or $|w| = 1$ or $|w| = 2$, then, $w \in \{\epsilon\} \cup \{a, b\} \cup \{aa, ab, ba, bb\} \subseteq L$, therefore, since $bba$ cannot be a substring of $w$, it follows that $w \in L$.

If $|w| = 3$, then, we have the following possibilities: $w = aaa$, $w = aba$, $w = bab$, $w = bbb$, $w = baa$, $w = aab$, or $w = abb$. Since $bba$ does not occur in any of these possibilities, we have that $w\in L$.

If $|w| \geq 4$, then, we have the following possibilities: $w = aaaxy$, $w = abaxy$, $w = babxy$, $w = bbbxy$, $w = baaxy$, $w = aabxy$, or $w = abbxy$, where $x \in L((a \cup ba)^{\ast})$ e $y \in L(b^{\ast})$. Thus, we have the follwing cases:

  • Case $w = bbbxy$ or $w = abbxy$: Note that, the sufix of both possibilities is $bbxy$, thus, we have that $|bbxy| < |w|$, hence by the inductive step follows that $bbxy \in L$. Then, whenever $w = bbbxy$ or $w = abbxy$, it follows that $x = \epsilon$ to not have $bba$ as substring. Thus, $bba$ cannot be a substring of $w = bbbxy$ and cannot be a substring of $w = abbxy$, therefore, we can conclude that $bbbxy \in L$ e $abbxy \in L$.

  • Case $w = aabxy$ ou $w = babxy$: In a similar way to the previous case, we have that the suffix $abxy \in L$ by the inductive hypothesis, since $|abxy| < |w|$. Thus, the string $x$ cannot begin with $ba$, so none of the possibilities considered here will have $bba$ as substring, hence, we can conclude that $aabxy \in L$ e $babxy \in L$.

  • Case $w = aaaxy$ ou $w = abaxy$ ou $w = baaxy$: For each of these cases, we have $w \in L$ if and only if $xy \in L$, but since $xy \in L((a \cup ba)^{\ast}b^{\ast})$ and $|xy| < |w|$, we have $xy \in L$ by induction hypothesis. Thus, $w$ does not have substring $bba$, therefore, $w \in L$.

Since we exhausted all possibilities, given that we take an arbitrary $w$ of $L(R)$, then we can conclude that $L(R) \subseteq L$.

  • Proof of $L \subseteq L(R)$.

Suppose that $w \in L$, we will prove that $w \in L(R)$ by induction on $|w|$. If $|w| = 0$, or $|w| = 1$, or $|w| = 2$, by the definition of $L$ we have that $w \in \{a, b\}^{\ast}$, then, it follows that $w \in \{\epsilon\} \cup \{a, b\} \cup \{aa, bb, ab, ba\} \subseteq L(R)$. Hence, for any of these possibilities, we can conclude that $w \in L(R)$.

If $|w| \geq 3$, since $w \in \{a, b\}^{\ast}$, then we have the following possibilities: $w = aax$ or $w = bbx$ or $w = abx$ or $w = bax$, with $x \in L$. Since $w$ does not have substring $bba$, note that, when $w = bbx$, $w \in L$ if and only if $x$ ndoes not begin with $ba$ or $x$ does not begin with $a$. Similarly, when $w = ab$, $w \in L$ if and only if, $x$ does not begin with $ba$. Since, $|x| < |w|$, by the inductive hypothesis we have $x \in L(R)$ and therefore $x = uv$ with $u \in L((a \cup ba)^{\ast})$ and $v \in L(b^{\ast})$. Since $aau$, $ bbu$, $abu$ and $bau$ are in $L(R)$, we see that $w \in L(R)$, as we wanted to show.

Since we explored all possibilities, given that we take an arbitrary $w$ of $L$, then we can conclude that $L \subseteq L(R)$.

Therefore, since we showed that $L(R) \subseteq L$ e $L \subseteq L(R)$, we can conclude that $L(R) = L$. $\square$

Is my proof okay? I am always getting confused in the inductive hypothesis, if I always using the correct number of symbols when I assume that $|w| \geq k$ for some $k \in \mathbb{N}$.

1

There are 1 best solutions below

5
On BEST ANSWER

The first half of the proof is okay, but there’s a problem with the second half: in the last sentence of the $|w|\ge 3$ paragraph it’s not true that $bbu$ and $abu$ are in $L(R)$ unless $u=\epsilon$.

In this kind of argument I find it easier to to use the least counterexample style of induction argument, like this:

Suppose that there is some word in $L\big((a\cup ba)^*\big)$ that contains the substring $bb$, and let $w$ be one of minimal length. There is an $x\in L\big((a\cup ba)^*\big)$ such that $w=xa$ or $w=xba$, and clearly no word in $L\big((a\cup ba)^*\big)$ ends in $b$, so any $bb$ substring of $w$ must be contained in $x$. However, $|x|<|w|$, so by hypothesis $x$ does not contain the substring $bb$. This contradiction shows that in fact no word in $L\big((a\cup ba)^*\big)$ has a substring $bb$.

Now $w\in L(R)$; then $w=xy$ for some $x\in L\big((a\cup ba)^*\big)$ and $y\in L(b^*)$. Clearly there is no $a$ in $y$, so $x$ must have a substring $bba$. But then $x$ has a substring $bb$, which is impossible. Thus, no word in $L(R)$ contains a substring $bba$, and therefore $L(R)\subseteq L$.

If $L\setminus L(R)\ne\varnothing$, let $w$ be word in $L\setminus L(R)$ of minimal length. Clearly $w\ne\epsilon$, so we can write $w=x\sigma$, where $\sigma\in\{a,b\}$ and $x\in L$. (Clearly every substring of $w$ belongs to $L$.) The minimality of $|w|$ implies that $x\in L(R)$, so $x=uv$ for some $u\in L\big((a\cup ba)^*\big)$ and $v\in L(b^*)$. There are now several possibilities.

If $\sigma=b$, then $v\sigma\in L(b^*)$, and $w=uv\sigma\in L(R)$, contrary to hypothesis.

If $\sigma=a$, $x$ cannot end in $bb$, so either $x=\epsilon$, $x=b$, $x$ ends in $a$, or $x$ ends in $ab$. The first two cases are ruled out: $w$ would be $a$ or $ba$, both of which are in $L(R)$. If $x$ ends in $a$, then $v=\epsilon$, so $$w=xa=ua\in L\big((a\cup ba)^*\big)\subseteq L(R)\,,$$ again contrary to hypothesis.

The only remaining possibility is that $x$ ends in $ab$, i.e., that $w$ ends in $aba$. In that case let $w=zaba$ (i.e., $x=zab$). Then $za\in L$, and $|za|<|w|$, so $za\in L(R)$. Clearly $za\in L\big((a\cup ba)^*\big)$, so $$w=zaba\in L\big((a\cup ba)^*\big)\subseteq L(R)\,,$$ yet again contrary to hypothesis. It follows that no such word $w$ exists, so $L\subseteq L(R)$, and therefore $L=L(R)$.