Understanding Regular Expression (0 U 1)*

6.8k Views Asked by At

This question concerns regular expressions as they appear within Automata Theory.

I am experiencing confusion with some of the notation, relating to the union operation and the way it interacts with the Kleene closure.

More specifically I am confused about what it means when we say (0 U 1)*

Can this be interpreted as equivalent to (0* U 1*) ?

I believe both of these expressions can be expressed in English as "the set of all strings of any length consisting exclusively of 0s or 1s, including the empty string".

Or, in set notation {epsilon, 0, 00, 000, 1, 11, 111...}

I would greatly appreciate someone helping me clarify this!!

2

There are 2 best solutions below

0
On

Welcome to MSE! Here $(0 \cup 1)^*$, often written as $(0 | 1)^*$, is actually the set of all bitstrings. Let's break it down. Inside of the parentheses is the expression $0 | 1$, meaning anything that that is a zero or a one is allowed (i.e. everything is allowed). The asterisk for Kleene closure tells you that you can repeat it as many times as you wish. So in a sentence, $(0 | 1)^*$ represents the set of strings where each bit is a zero or one, and you can have as many such bits as possible. Hopefully it's now clear why this is the set of all bitstrings.

On the other hand, $(0^* \cup 1^*) = (0^* | 1^*)$ is far from all bitstrings. This expression says that the bitstrings must be either of the form $0^*$ (which is any bitstring with only zeroes) or of the form $1^*$ (which is any bitstring with only ones). This expression does not accept bitstrings with both a zero and a one, and as such is not the same as the $(0 | 1)^*$.

2
On

The two regular sets are quite different. $0\cup 1$ specifies the set $\{0,1\}$, the set of words that are equal either to $0$ or to $1$. Taking the star of this set gives us all words that can be formed by concatenating words in $\{0,1\}$, so it gives us all possible finite strings of zeroes and ones: it’s the set

$$\{\epsilon,0,1,00,01,10,11,000,001,010,011,100,101,110,111,\ldots\}\,.$$

The expression $0$ specifies the set $\{0\}$, so $0^*$ specifies the set of all possible finite strings of zeroes: $\{\epsilon,0,00,000,\ldots\}$. Similarly, $1^*$ specifies the set of all possible finite strings of ones: $\{\epsilon,1,11,111,\ldots\}$. The $\cup$ operator specifies indicates that we’re to take the union of these two sets: $0^*\cup 1^*$ specifies the set

$$\{\epsilon,0,1,00,11,000,111,0000,1111,\ldots\}\,.$$

Any finite string of zeroes and ones that contains at least one $0$ and at least one $1$ is in the first set but not the second.

By the way, you may at some point encounter other notations: $0\cup 1$ can also be written $0+1$, $0\mid 1$, or $0\lor 1$, and the empty string is sometimes called $\lambda$ rather than $\epsilon$.