CFG with even length and not in the form of SS.

Question

CFG with even length and not in the form of SS.

432 Views Asked by Bumbble Comm At 29 Mar 2026 - 8:21

I know that we can construct a CFG of all strings over $\{0, 1\}$ with even length using the following

$$T \mapsto 0T0 \mid 0T1 \mid 1T0 \mid 1T1 \mid \epsilon$$

But I want the string to have non-equal halves, meaning that they are not in the form $SS$, where $S$ is some string.

How I can modify the above CFG to satisfy this constraint?

Original Q&A

There are 1 best solutions below

**Bumbble Comm** · Accepted Answer

The language that you want to prove to be context-free is

$$L = \{uv \in \{0,1\}^* \mid |u| = |v| \text{ and } u \neq v\} $$

where $|w|$ is the length of the string $w$. Actually, it turns out that $L = L'$ where

$$L' = \{u_1 x \, u_2 \, v_1 y \, v_2 \in \{0,1\}^* \mid x,y\in \{0,1\}, \ |u_1| = |u_2|, \ |v_1| = |v_2| \text{ and } x \neq y \} $$

The formal proof that $L = L'$ is below and is quite technical. But intuitively, it is clear that $L = L'$. Indeed, if a string $w$ of even length can be split in two non-identical halves, then some characters from one of the halves can be "lent" to the other in order to decompose $w$ in the way described in the definition of $L'$. And the inverse process is also well defined.

The psychological advantage of considering $L'$ instead of $L$ (although they are the same set) is that the language $L'$ is clearly generated by the following context-free grammar:

\begin{align} S &\ \to \ T_0 T_1 \ \mid \ T_1 T_0 \\ T_0 &\ \to \ 0 \ \mid \ 0 \, T_0 \, 0 \ \mid \ 0 \, T_0 \, 1 \ \mid \ 1 \, T_0 \, 0 \ \mid \ 1 \, T_0 \, 1 \\ T_1 &\ \to \ 1 \ \mid \ 0 \, T_1 \, 0 \ \mid \ 0 \, T_1 \, 1 \ \mid \ 1 \, T_1 \, 0 \ \mid \ 1 \, T_1 \, 1 \end{align}

The intuition is that $T_0$ (the rule in the second line) yields a string of the form $w_1 0 \, w_2$ with $|w_1| = |w_2|$, while $T_1$ (the rule in the third line) yields a string of the form $w_1' 1 \, w_2'$ with $|w_1'| = |w_2'|$. Putting $T_0$ and $T_1$ together (the rule in the first line) yields a string of the form described in the definition of $L'$.

Let us prove that $L = L'$ in a rigorous way.

Proof of $L \subseteq L'$. Let $w \in L$. So, $w = uv$ where $|u| = |v| = n$ for some $n \geq 0$, and $u \neq v$; that is, $u$ and $v$ have the same length and differ at least in their $i^\text{th}$ character, for some position $i$. Therefore, $u = x_1 \dots x_{i-1} x_i \, x_{i+1} \dots x_n$ and $v = y_1 \dots y_{i-1} y_i \, y_{i+1} \dots y_n$ with $x_i \neq y_i$ for some $1 \leq i \leq n$ (all the $x_j$'s and $y_j$'s are in $\{0,1\}$). There are three cases:

either $i-1 = n - i$ (that is, $|x_1 \dots x_{i-1}| = |x_{i+1} \dots x_{n}|$ and $|y_1 \dots y_{i-1}| = |y_{i+1} \dots y_{n}|$); then, we set $u_1 = x_1 \dots x_{i-1}$ and $u_2 = x_{i+1} \dots x_{n}$, and $v_1 = y_1 \dots y_{i-1}$ and $v_2 = y_{i+1} \dots y_{n}$; thus, $w = u_1 x_i u_2 \,v_1 y_i v_2$ with $|u_1| = i-1 = n-1 = |u_2|$, $|v_1| = i-1 = n-i = |v_2|$ and $x_i \neq y_i$; therefore, $w \in L'$;
or $i-1 > n - i$ (that is, $|x_1 \dots x_{i-1}| > |x_{i+1} \dots x_{n}|$ and $|y_1 \dots y_{i-1}| > |y_{i+1} \dots y_{n}|$); then, we set $k = 2i - 1 - n$, and $u_1 = x_1 \dots x_{i-1}$ and $u_2 = x_{i+1} \dots x_{n} \, y_1 \dots y_k$, and $v_1 = y_{k+1} \dots y_{i-1}$ and $v_2 = y_{i+1} \dots y_{n}$; thus, $w = u_1 x_i u_2 \,v_1 y_i v_2$ with $|u_1| = i-1 = n-i + k= |u_2|$, $|v_1| = i - 1- k = n-i = |v_2|$ and $x_i \neq y_i$; therefore, $w \in L'$;
or $i-1 < n - i$ (that is, $|x_1 \dots x_{i-1}| < |x_{i+1} \dots x_{n}|$ and $|y_1 \dots y_{i-1}| < |y_{i+1} \dots y_{n}|$); this case is analogous and dual to the previous one.

Proof of $L' \subseteq L$. Let $w \in L'$. So, $w = u_1 x \, u_2 \, v_1 y \, v_2$ with $x,y \in \{0,1\}$, $|u_1| = |u_2| = n$ for some $n \geq 0$, $|v_1| = |v_2| = m$ for some $m \geq 0$, and $x \neq y$. Therefore, $u_1 = x_1 \dots x_{n}$ and $u_2 = x_{1}' \dots x_n'$ and $v_1 = y_1 \dots y_{m}$ and $v_2 = y_{1}' \dots y_m'$ with $x_i \neq y_i$, where all the $x_j$'s, $x_j'$'s, $y_j$'s and $y_j'$'s are in $\{0,1\}$. There are three cases:

either $n = m$ (that is, $|u_1| = |u_2| = |v_1| = |v_2|$); then, we set $u = u_1 \, x \, u_2$ and $v = v_1 \, y \, v_2$, hence $w = uv$ with $|u| = 2n + 1 = 2m + 1 = |v|$ and $u \neq v$ (because $x \neq y$, where $x$ is the $(n+1)^\text{th}$ character in $u$, and $y$ is the $(n+1)^\text{th}$ character in $v$); therefore, $w \in L$;
or $n > m$ (that is, $|u_1| = |u_2| > |v_1| = |v_2|$); then, we set $k = n-m$ and $u = u_1 \, x \, x_1' \dots x_m'$ and $v = x_{m+1}' \dots x_{n}' v_1 \, y \, v_2$, hence $w = uv$ with $|u| = n + m + 1 = n - m + 2m + 1 = |v|$ and $u \neq v$ (because $x \neq y$, where $x$ is the $(n+1)^\text{th}$ character in $u$, and $y$ is the $(n+1)^\text{th}$ character in $v$); therefore, $w \in L$;
or $n < m$ (that is, $|u_1| = |u_2| < |v_1| = |v_2|$); this case is analogous and dual to the previous one.

CFG with even length and not in the form of SS.

There are 1 best solutions below

Related Questions in DISCRETE-MATHEMATICS

Related Questions in FORMAL-LANGUAGES

Related Questions in AUTOMATA

Related Questions in CONTEXT-FREE-GRAMMAR

Trending Questions

Popular # Hahtags

Popular Questions