Proving that a regular expression belongs to a certain language

1k Views Asked by At

I've often come by questions that require proving that a certain regular expression belongs to that language.

Example:

Given $\Sigma = \{0,1\}$, and the language $L$ of all the words that have the combination $01$ only (and exactly) once, and a regular expression $r = (1^*)(0^{*})01(1^*)(0^*) $, prove that $L(r) = L$.

How can we prove such a thing? is it induction?

2

There are 2 best solutions below

0
On BEST ANSWER

As J.-E. Pin already noted, you can’t say that $r$ is in $L$: members of $L$ are strings of zeroes and ones. What you mean is that $r$ generates or represents $L$. To show that this is the case, you must show that $L(r)=L$, which you can do by showing that $L(r)\subseteq L$ and $L\subseteq L(r)$.

It’s clear that any word matching $r=(1^*)(0^*)01(1^*)(0^*)$ contains at least one instance of $01$, and you shouldn’t have too much trouble explaining why it contains only one instance of $01$; that will show that $L(r)\subseteq L$. To show that $L\subseteq L(r)$, suppose that $w\in L$. Then $w=u01v$ for some $u,v\in\Sigma^*$ such that neither $u$ nor $v$ contains any instance of $01$. If you can show that the regular expression $s=(1^*)(0^*)$ represents the language of all binary strings that do not contain the substring $01$, you’ll be done: you’ll know that $u,v\in L(s)$ and therefore that $w\in L(r)$. I’ve left an informal proof of that result in the spoiler-protected block below.

Let $L_1$ be the language of all binary strings that do not contain the substring $01$. The key observation is that if $x=a_1a_2\ldots a_n\in L_1$, and there are $k,\ell$ such that $1\le k<\ell\le n$, $a_k=0$, and $a_\ell=1$ (i.e., if $x$ has at least one $0$ that comes before at least one $1$), then $x$ necessarily contains a substring $01$. (Why?) Thus, words of $L_1$ must have all of their $1$’s before all of their $0$’s: they must be in $L(s)$. On the other hand, it’s clear that $L(s)\subseteq L_1$, so $L_1=L(s)$.

0
On

You cannot say that a regular expression belongs to a language. A regular language represents a language. So you can say that the language represented by a regular expression is [a subset of, equal to] another language.

Coming back to your question, two hints: it should not be difficult to prove that $L(r) \subseteq L$. To prove the opposite inclusion, you should first compute the minimal DFA of $L$.