How does regex (0*1*)* denote all binary strings?

6.5k Views Asked by At

Correction Edit: I want to know how (0*1*)* denotes all binary strings, not (0*1)*.

Just started learning about regex. I was told

Language denoted by the regex( (0+1)* )

denotes all binary strings.

My understanding is,

L(R + S) = L(R) Union L(s) 

so

L(0+1) = L(0) U L(1) = {0, 1} 

and when you add the * i.e. {0, 1}*, it can be a mix of 0s and 1s in however order (1 can come before the 0), and however many times. That is why it denotes all binary strings.

I was also told

L(RS) = L(R)L(S)

So L( (10)* ) is {1} {0} concatenated together as many as times as we want. So L( (10)* ) = {10, 1010, 101010 etc.} but 01 does not belong in L( (10)* ) because when concatenating, the order of the symbols cannot change. The order of the symbols can only change if they are separated with a comma in the set (i.e. union, not concatenation).

So how does L( (0*1*)* ) = the set of all binary strings = L( (0+1)* )?

My understanding is,

L( (0*1*)* ) 
= ( L(0*)L(1*) )* 
= ( {01, 001, 0001, 011, 0011, 000111 etc.} )*
= {01, 01001, 010001, 01011, 010011, 01000111 etc.} 

and as seen, there is no way for the symbols to be lead with 1 or end with 0. So how does it equal the set of all binary strings (I just started learning about this, so I'm trying to figure out where my understanding is incorrect).

3

There are 3 best solutions below

2
On BEST ANSWER

It doesn't.

Your unfolding is not completely correct: 0* generates zero or more zeroes, so 0*1 generates the strings 1, 01, 001 and so forth.

However (0*1)* still doesn't generate all binary strings: It cannot generate any string that ends with 0.

0
On

You can see that the strings consisting of one or more zeros only and the strings consisting of one or more ones only are also in $L(0^*)L(1^*)$ because both $L(0^*)$ and $L(1^*)$ contain the empty string. So $0, 1\in (L(0^*)L(1^*))$ which means that $\{0, 1\}^*\subseteq (L(0^*)L(1^*))^*$.

0
On

Let $R:= (0^*1^*)^*$. We have to show that $R \equiv(0+1)^*$. We will do so by showing that every string $w\in \{0,1\}^*$ belongs to $L(R)$ by induction on the length of $w$.

Evidently, $\varepsilon\in L(R)$. Now, fix $k\geqslant 0$ and suppose that for all strings $u\in\{0,1\}^*$, $|u| = k \implies u \in L(R)$. Let $w\in\{0,1\}^*$ be such that $|w| = k+1$. Then, either $w = 0x$ or $w = 1x$ for some $x \in \{0,1\}^*$ with $|x| = k $. By induction hypothesis, $x \in L(R)$. Now, observe that $$R=(0^*1^*)^* \equiv \varepsilon +0^*1^*(0^*1^*)^* = \varepsilon +0^*1^*R\geqslant 0R + 1R.$$ It immediately follows that $w\in L(R)$. This finishes the proof. $\square$