Designing Context-Free Grammars for Sets of Strings

5.9k Views Asked by At

I'm pretty lost, and would appreciate help or solutions to the following two exercises. I don't really know where to begin or even how to correctly denote a context-free grammar. I have to design CFGs for the following two sets:

1) $\{a^ib^jc^k \mid i \neq j \;or\; j \neq k \}$; i.e. the set of strings containing $a$s followed by $b$s followed by $c$s, s.t. there is a different number of $a$s and $b$s or a different number of $c$s and $b$s, or both.

I think I should have something like $S \rightarrow aAbBcC$, where thereafter $A \rightarrow a | A | \epsilon$, and similar for $b,c$, etc. But I don't know how to ensure different numbers of letters in the string, and more generally, I don't know what I'm doing.

2) The set of all strings of $a$s and $b$s that are not of the form $ww$, that is, not equal to any string repeated.

2

There are 2 best solutions below

4
On BEST ANSWER

That or in the first problem should suggest writing the grammar in two parts, one that generates $L_1=\{a^ib^jc^k:i\ne j\}$ and one that generates $L_2=\{a^ib^jc^k:j\ne k\}$. If $S_1$ is the initial symbol for the first grammar, $S_2$ is the initial symbol for the second grammar, and the two grammars have no non-terminal symbols in common, they can be combined to give a grammar for the desired language simply by adding the production $S\to S_1\mid S_2$.

In generating $L_1$ you can generate any number of $c$’s, but you have to make sure that you don’t generate equal numbers of $a$’s and $b$’s. The $c$’s are no problem: we can have a production $S_1\to XC$, where $C$ will generate any number of $c$’s, and $X$ will deal with that $a$’s and $b$’s. We can let $X$ generate any number of matched pairs of $a$’s and $b$’s with the production $X\to aXb$, and then we can switch either to something that generates any positive number of $a$’s or to something that generates any positive number of $b$’s. For instance, the production $X\to Y$ together with $Y\to aY\mid a$ will take care of the words with more $a$’s than $b$’s.

That’s all of the pieces needed for the first problem; all you have to do is finish putting them together.

Added: The second problem is solved in this PDF: a context-free grammar is given along with a proof that it generates the desired language.

12
On

I think I managed to do the individual grammars correctly, but I'm not sure if I composed them correctly in the end. Also, am I allowed to have e.g. $C \rightarrow C | c | \epsilon$? (I recall reading something about using epsilons in CFGs...) Do I even need to define the grammar with the $G=\{V,\Sigma, P,S\}$ notation? Please let me know what you think.

Answer to Problem 1:

This grammar $L$ is best constructed in two parts: $L_1 = \{a^ib^jc^k|i\neq j\}$, and $L_1 = \{a^ib^jc^k|j\neq k\}$. Let $S_1 \rightarrow L_1$ and $S_2 \rightarrow L_2$. We define that the two grammars do not share any non-terminal symbols, thus we can produce the desired CFG by $S \rightarrow S_1 \mid S_2$. In $L_1$, we can generate an arbitrary number of $c$s. So let us define $S_1 \rightarrow XC$, where $C \rightarrow C \mid c\mid \epsilon$. Then let us define $X \rightarrow aXb$, which will generate any length of matched $a$s and $b$s. To ensure the case $i>j$, we also define $X \rightarrow Y$, and let $Y\rightarrow aY | a$. To ensure the case $j>i$, we define $X \rightarrow Z$, and let $Z \rightarrow Zb | b$.

So we define our CFG $L_1$ as such:
$L_1 = (\{S_1,X,Y,Z,C\},\{a,b,c\},P,S_1)$ with production rules:
$S_1 \rightarrow XC$
$C \rightarrow cC \mid \epsilon$
$X \rightarrow aXb$
$X \rightarrow Y$
$Y \rightarrow aY \mid a$
$X \rightarrow Z$
$Z \rightarrow Zb \mid b$

The case for $L_2$ is similar, except for that we can generate an arbitrary number $a$s instead of $c$s. Our CFG $L_2$ can be defined as:
$L_2 = (\{S_2,A,Q,R,S\},\{a,b,c\},P,S_2)$
$S_2 \rightarrow AQ$
$A \rightarrow Aa \mid \epsilon$
$Q \rightarrow bQc$
$Q \rightarrow R$
$R \rightarrow bR \mid b$
$Q \rightarrow S$
$S \rightarrow Sc \mid c$

So to conclude, our CFG $L$ is defined: $L = (\{S,X,Y,Z,C\},\{a,b,c\},P,S)$,and $S \rightarrow S_1 | S_2$