A CFG G is given with the following productions where S is the start symbol, A is a non-terminal and a and b are terminals. $$S → aS \mid A \\ A → aAb \mid bAa \mid \epsilon$$
Which of the following strings is generated by the grammar above?
- aabbaba
- aabaaba
- abababb
- aabbaab
AFAIK :
Option $(4)$ is true.
$S→aS→aA→aaAb→aabAab→aabbAaab→aabbaab$.
Can you explain please, what is the language of this grammar?
I'd tried for :
- More or equal number of a's than b's : False
- Starting and ending with same symbol : False
- Starting with a and more number of a's than b's : False
A derivation may start by using the production $S\to aS$ any number of times, but eventually it must use the production $S\to A$, since the only terminal production is $A\to\epsilon$. Say it uses $S\to aS$ $n$ times, where $n\ge 0$, and then $S\to A$; at that point we have $a^nA$.
Now consider what kinds of strings can be produced starting with $A$. The only available non-terminal productions are $A\to aAb$ and $A\to bAa$, each of which produces one $a$ and one $b$. Suppose that we apply these productions a total of $m$ times; then at that point we have
$$a^nx_1x_2\ldots x_{m-1}x_mA\overline{x}_m\overline{x}_{m-1}\ldots\overline{x}_2\overline{x}_1\;,$$
where each $x_k$ is either $a$ or $b$, and
$$\overline{x}_k=\begin{cases} b,&\text{if }x_k=a\\ a,&\text{if }x_k=b\;. \end{cases}$$
If we now apply $A\to\epsilon$, we end up with something of the form
$$a^nx_1x_2\ldots x_{m-1}x_m\overline{x}_m\overline{x}_{m-1}\ldots\overline{x}_2\overline{x}_1\;.$$
For any word $w\in\{a,b\}^*$ let $\widehat{w}$ be the word obtained by reversing the order of $w$, replacing each $a$ by a $b$, and replacing each $b$ by an $a$. For instance, if $w=abb$, then $\widehat{w}=aab$. Then the language of this grammar is the set of all words of the form $a^nw\widehat w$, where $n\ge 0$, and $w\in\{a,b\}^*$:
$$L(G)=\big\{a^nw\widehat w:n\ge 0\text{ and }w\in\{a,b\}^*\big\}\;.$$
Now just test your four strings against this description; you’ll find that only one of them fits. For instance, $aabbaba$ doesn’t fit. To see this, observe that the $w\widehat w$ part must have even length, so it must be $abbaba$, $baba$, $ba$, or $\epsilon$. There can be no $b$s before it, so it must be $abbaba$. But then $w=abb$, so $\widehat w=aab\ne aba$, and $abbaba$ doesn’t have the right form.