Proving that a particular CFG grammar generates language L

102 Views Asked by At

$$L=\left\{w\in\{x,y\}^*\mid N(y)=2N(x)\right\}$$

and $G$ is

$$S\to yySx\mid ySxSy\mid xSyy\mid SS\mid\epsilon$$

I need to proof that the grammar generates this language. I know that I need to use induction to prove it, but I can't follow through

so, could you show how to prove it?

1

There are 1 best solutions below

0
On BEST ANSWER

There are two parts to this: you have to show that every word generated by $G$ is in $L$, and you have to show that every word in $L$ is generated by $G$. The first of these is straightforward. Suppose that $w$ is generated by $G$. Then it has a derivation

$$S=w_0\Rightarrow w_1\Rightarrow w_2\Rightarrow \ldots\Rightarrow w_{n-1}\Rightarrow w_n=w\,.$$

Clearly $|S|_x=|S|_y=0$, so $|S|_y=2|S|_x$. Now suppose that $0\le k<n$ and that $|w_k|_y=2|w_k|_x$, and verify that $|w_{k+1}|_y=2|w_{k+1}|_x$ no matter which production of $G$ is used to derive $w_{k+1}$ from $w_k$. Once you’ve done that, you can conclude by induction that $|w|_y=|w_n|_y=2|w_n|_x=2|w|_x$ and hence that $w\in L$.

The second part is a bit harder, but it can be done by induction on $|w|$. If $w\in L$ has length $0$, then $w=\epsilon$, which is generated by $G$. Now for the induction step let $n\in\Bbb Z^+$, suppose that every word of $L$ of length less than $n$ is generated by $G$, and let $w\in L$ be of length $n$; you need to show that $w$ is generated by $G$.

We’ll need to break the argument into cases, and we’ll use a technique that is useful but not obvious in order to do so. Say $w=u_1u_2\ldots u_n$, where each $u_k$ is either $x$ or $y$. Let $c_0=0$, and for $k=0,\ldots,n-1$ let

$$c_{k+1}=\begin{cases} c_k+1,&\text{if }u_k=y\\ c_k-2,&\text{if }u_k=x\,. \end{cases}$$

This is less complicated than it looks. Imagine reading $w$ from left to right, starting a counter at $0$, increasing it by $1$ each time you read a $y$, and decreasing it by $2$ each time you read an $x$; $c_k$ is the value of the counter after you’ve read $k$ letters of $w$. Because $w\in L$, it has twice as many $y$s as $x$s, so the counter must reach $0$ at the end of the word: $c_n=0$.

  • Suppose first that $c_k=0$ for some $k$ with $0<k<n$. Let $w_1=u_1\ldots u_k$ and $w_2=u_{k+1}\ldots u_n$. Show that $w_1,w_2\in L$. Clearly $|w_1|,|w_2|<n$, so the induction hypothesis then ensures that $w_1$ and $w_2$ have derivations in $G$. Show that $w$ has a derivation in $G$ that starts $S\Rightarrow SS$.

Now suppose that $c_k\ne 0$ for $1\le k\le n-1$.

  • If $u_1=x$, then $c_1=-2$. Suppose that $c_k>0$ for some $k>1$. Then there is a smallest $k$ such that $c_k>0$. When the counter increases, it always increases by $1$, so it must be that $c_k=1$ and $c_{k-1}=0$. But this is impossible (why?), so it must be that $c_k<0$ for $1\le k\le n-1$. Since $c_n=0$, this implies that $c_{n-1}=-1$ and $c_{n-2}=-2$ (why?) and hence that $u_{n-1}=u_n=y$. Thus, $w=xw_1yy$, where $w_1=u_2\ldots u_{n-2}$. Clearly $|w_1|<n$, so $w_1\in L$, and $w_1$ has a derivation in $G$. Show how to modify this derivation to get a derivation of $w$ in $G$.

Now suppose that $c_k\ne 0$ for $1\le k\le n-1$, and $u_1=y$, so that $c_1=1$. There are now two possibilities.

  • Suppose that $u_n=x$, so that $c_{n-2}=2$. Argue as in the previous bullet point to show that if some $c_k<0$, then there must be an $\ell$ such that $k<\ell<n-1$ and $c_\ell=0$ and that this is impossible. Conclude that $c_k>0$ for $1\le k\le n-1$. In particular, $c_2>0$, so $u_2=y$ (why?). Thus, $w=yyw_1x$, where $w_1=u_3\ldots u_{n-1}$. Show that $w_1$ has a derivation in $G$, and show how to modify this derivation to get a derivation of $w$ in $G$.
  • Finally, suppose that $u_n=y$. Then $c_1=1$, and $c_{n-1}=-1$ (why?), but there is no $k$ such that $1<k<n-1$ and $c_k=0$. This means that there must be a $k$ such that $c_k=1$ and $c_{k+1}=-1$; why? Clearly this means that $u_{k+1}=x$. Let $w_1=u_2\ldots u_k$ and $w_2=u_{k+2}\ldots u_{n-1}$. Show that $w_1,w_2\in L$; clearly $|w_1|,|w_2|<n$, so $w_1$ and $w_2$ have derivations in $G$. Show how to modify those derivations to get a derivation of $w$ in $G$.

Note that each of the four cases is based on one of the non-terminal productions in $G$.