Regularity of the language

236 Views Asked by At

$L= \{ xy \mid x,y \in \{a,b\}^* \}$ is not a regular language. But would it make a difference if we added another constraint that $|x|=|y|$. Would this enforce the condition of finiteness on the language?

4

There are 4 best solutions below

4
On

Hint: start by asking yourself which strings can be broken down into $xy$ with $|x|=|y|$ and which ones can't. For instance, which of the following strings do you think belong to the language $\{xy \mid x,y \in \{a,b\}^*, |x|=|y|\}$?

$b$, $aa$, $ab$, $ababab$, $aaabbbb$?

Once you can see a simpler way to understand this language, it should become very obvious how to build a finite state machine which recognizes it.

0
On

I can see that you’ve been having some trouble with this material, so I’m going to discuss the two languages that you mention in some detail.

By definition the strings in $\{a,b\}^*$ are all finite; that’s something that you never have to worry about.

Now let’s look at the two languages that you mentioned, $$L_1=\{xy:x,y\in\{a,b\}^*\}$$ and $$L_2=\{xy:x,y\in\{a,b\}^*\text{ and }|x|=|y|\}\;.$$

If $w$ is any word in $\{a,b\}^*$, we can write $w=w\epsilon$, where $w,\epsilon\in\{a,b\}^*$, so $\{a,b\}^*\subseteq L_1$. On the other hand, it’s obvious that $L_1\subseteq\{a,b\}^*$, so $L_1=\{a,b\}^*$. That is, $L_1$ is just the set of all finite strings of $a$’s and $b$’s, including the empty string. This certainly is a regular language: it’s recognized by the DFA that has just one state, which is an acceptor state, and transitions from that state to itself for both $a$ and $b$. Alternatively, it’s generated by the regular grammar whose productions are

$$S\to aS\mid bS\mid\epsilon\;.$$

Now $L_2$ is definitely not all of $\{a,b\}^*$, because $a\notin L_2$. To see this, note that the only ways to write $a$ in the form $xy$ for $x,y\in\{a,b\}^*$ are with $x=a$ and $y=\epsilon$, or with $x=\epsilon$ and $y=a$. But $|a|=1\ne 0=|\epsilon|$, so neither of these decompositions satisfies the defining condition for members of $L_2$. On the other hand, $aa\in L_2$, because we can write $aa$ as $xy$ with $x=a$ and $y=a$.

What words $w\in\{a,b\}^*$ can be written in the form $xy$ with $|x|=|y|$? If $w=xy$ is such a word, then $|w|=|x|+|y|=2|x|$, so $|w|$ must be an even number. In other words, we now know that every $w\in L_2$ has even length. This automatically rules out words of odd length, like $a$, $bba$, $abbab$. The next question that you should ask is whether anything else is ruled out: are there words of even length that are not in $L_2$?

Suppose that $w\in\{a,b\}^*$ and $|w|$ is even; then $|w|=2n$ for some non-negative integer $n$. If $n=0$, then $w=\epsilon$; is this in $L_2$? Yes, because $\epsilon=\epsilon\epsilon$, and $|\epsilon|=|\epsilon|$. Now suppose that $n>0$. Then we can write out the word $w$ as $c_1c_2\dots c_nc_{n+1}\dots c_{2n}$, where each $c_k$ is either $a$ or $b$. Is it possible to write $w$ in the form $xy$, where $x,y\in\{a,b\}^*$ and $|x|=|y|$? Sure: just make $x$ the first $n$ letters of $w$ and $y$ the last $n$ letters. To be precise, let $x=c_1c_2\dots c_n$ and $y=c_{n+1}c_{n+2}\dots c_{2n}$. Therefore $w\in L_2$, and we’ve shown that every $w\in\{a,b\}^*$ of even length belongs to $L_2$.

Putting the pieces together, we see that a word in $\{a,b\}^*$ belongs to $L_2$ if and only if its length is even: $L_2=\{w\in\{a,b\}^*:|w|\text{ is even}\}$. Once you realize this, it’s easy to design a DFA that recognizes $L_2$ or a regular grammar that generates $L_2$. You can do it with a two-state DFA, or with a regular grammar with two non-terminal symbols and five productions.

0
On

$L={xy∣x,y∈{a,b}}^∗}$ is the same as $∑^*$ so it has all the strings! and you can find a DFA that accepts all the strings (it has one state, which is both the initial state and the final state)so it is regular. {xy∣x,y∈{a,b}∗,|x|=|y|} is the same as {w| |w| is even} which is a regular language (you can find a DFA with 2 states for it)

0
On

I'll make another stab at clarifying this, unpacking your question.

$L=\{xy\ |\ x, y \in \{a, b\}^*\}$ is not a regular language.

This is false. This language consists of strings of the form $xy$, where $x\text{ and }y$ can be any finite strings (including the empty string) consisting of $a$s and $b$s. So, because $x$ or $y$ can be the empty string, it simply consists of all strings over $\{a, b\}$, namely the language denoted by the regular expression $(\mathtt{a}+\mathtt{b})^*$, which is of course regular, by definition. Or, if you prefer, there's Brian's answer: a FA for this has one state, which is both the start and final state and has a transition function that on $a$ or $b$ goes to itself.

Would $L=\{xy\ |\ x, y \in \{a, b\}^*\text{ and } |x|=|y|\}$ be a regular language?

Yup. Since it's made up of the concatenation of any strings over $\{a, b\}$ which are of the same length, this language is just all strings of even length over that alphabet. A FA for it would have two states, a start state (which is also the final state) and another one, with $a, b$ transitions between both. Or, if you'd prefer regular expressions, it's the language denoted by $(\mathtt{aa}+\mathtt{ab}+\mathtt{ba}+\mathtt{bb})^*$.

(In your comment) Does $|x|=|y|$ state that the length is finite?

Not immediately. But if we back away from this, the definition of $\Sigma^*$ for a set of characters $\Sigma$ is the set of all finite strings using characters from $\Sigma$. We use this definition because allowing infinitely long words in a language makes an awful lot of the nice results about languages invalid. In other words, to keep us paying in a friendly sandbox, we require that a language is a (possibly infinite) set of finite strings over a finite alphabet.