Push down automata problem

2.2k Views Asked by At

Informally describe the Nondeterministic PDA that generates:

$$\{x\#y\ \mid x,y\in\{a,b\}^{*}\text{and}\space x\ne y\}$$

My initial plan was to use nondeterminism to go through each character before and after the # and try to match them, and if they do not match, then accept. So I would check the first char against the first char after #, etc. However I can't get this idea to work since I can't seem to both record the character position after the # as well as the character I need to match against before the # with the single stack.

Thanks for any help.

2

There are 2 best solutions below

6
On BEST ANSWER

Consider the following languages: $$\newcommand{\A}{\mathtt{a}} \newcommand{\B}{\mathtt{b}} \newcommand{\P}{\mathtt{\#}}\begin{gather} L_\text{long} = \{ x \P y : x , y \in \{ \A , \B \}^* , | x | > | y | \};\\ L_\text{short} = \{ x \P y : x , y \in \{ \A , \B \}^* , | x | < | y | \};\\ L_\text{diff} = \{ x \P y : x , y \in \{ \A , \B \}^* , | x | = | y |, ( \exists k ) ( x_k \neq y_k ) \}. \end{gather}$$ Note that your language $L$ is clearly the union of these three languages, and so a nondeterministic PDA for $L$ can first nondeterministically choose which of these three languages the string belongs to, and then check that the string actually belongs to that language. (Nondeterministic) PDAs for $L_\text{long}$ and $L_\text{short}$ are relatively easy to construct, the real difficulty is with $L_\text{diff}$.

I don't claim that $L_\text{diff}$ is context-free (I actually don't think it is), but there is a context-free language which includes $L_\text{diff}$ as a sub-language, and is itself a sub-language of $L$: let $L^\prime_{\text{diff}}$ consist of all strings $w \in \{ \A , \B , \P \}^*$ such that

  • $\P$ occurs exactly once in $w$; and
  • if $w^\prime$ is the string obtained by removing the single instance of $\P$ in $w$, then $w^\prime \in \{ xy \in \{ \A , \B \}^* : |x| = |y| , x \neq y \}$.

It should be clear that $L_\text{diff}$ is a sub-language of $L_\text{diff}^\prime$, and every string in $L_\text{diff}^\prime$ belongs to $L$.

Suppose that $x = x_1 \cdots x_n$ and $y = y_1 \cdots y_n$ are such that $x \neq y$. This means that there is an $k \leq n$ such that $x_k \neq y_k$: for the moment let us suppose $x_k = \A$ and $y_k = \B$. Then we may rewrite our string as $$x y = x_1 \cdots x_{k-1} \;\A\; x_{k+1} \cdots x_n \; y_1 \cdots y_{k-1} \;\B\; y_{k+1} \cdots y_n$$ Note that the substring $x_{k+1} \cdots x_n \; y_1 \cdots y_{k-1}$ has length $(n-k) + (k - 1 ) = n - 1$ and so let us write it as $$z_1 \cdots z_{k-1} z_k \cdots z_{n-1}$$ Then $$\begin{align} xy &= x_1 \cdots x_{k-1} \;\A\; x_{k+1} \cdots x_n \; y_1 \cdots y_{k-1} \;\B\; y_{k+1} \cdots y_n \\ &= \overbrace{x_1 \cdots x_{k-1}}^{\text{length } k-1}\; \A\;\overbrace{z_1 \cdots z_{k-1}}^{\text{length } k-1} \; \overbrace{z_k \cdots z_{n-1}}^{\text{length }n-k} \;\B\; \overbrace{y_{k+1} \cdots y_n}^{\text{length }n-k} \end{align}$$ So the string $xy$ can be seen as the concatenation of two odd length strings $u , w$ which have different middle symbols. Once this is observed, it is relatively easy to construct a nondeterministic PDA for this language. (First non-deterministically decide whether the first substring has a $\A$ as its middle symbol and the second a $\B$, or vice versa, as you read the string first pop a symbol onto the stack until you non-deterministically guess the middle symbol of the first substring, and then pop those symbols off; once the stake is empty again pop symbols back on and non-deterministically guess where the middle symbol of the second string is.)

To obtain a nondeterministic PDA for $L_\text{diff}^\prime$ also ensure that you find one (and only one) $\P$ in the string before accepting (but reading a $\P$ does not affect the stack in any way).

4
On

This is actually easier than Arthur Fischer is making it. We can make more use of that $\#$ marking the middle of the string. My answer is actually much along the lines of what you said you tried, I think. I guess you might have been forgetting that you're allowed more than one state, which does make it a bit easier to construct the automaton.

Call your language $L$ and let $X = \{a,b\}$. Let $L_1 = \{u\# w \mid u,w\in X^*, |u|\neq |w|\}$ and let $L_2$ be the language of all strings $u\# w$ with $u,w\in X^*$ and $u_i\neq w_i$ for some $i\leq \min\{|u|, |w|\}$. Then your language is the (non-disjoint) union of $L_1$ and $L_2$. (I'm using $|u|$ for the length of $u$.)

Recognising $L_1$ with a pushdown automaton is very easy. To recognise $L_2$, have in addition to the start state $q_0$ five states $p_a$, $p_b$, $q_a$, $q_b$ and $q$, the last being the only accept state. Use a single stack symbol $X$. In $q_0$, we put an $X$ on the stack every time we read a symbol. If we read $\#$ in $q_0$, the automaton should fail, since $q_0$ is for reading the word $u$ in $u\# w$. At any point, we can non-deterministically guess that we have reached the point at which $u_i\neq w_i$. In this case, move to state $p_x$, where $x$ is the symbol just read. At this point we have $i-1$ $X$'s on the stack.

In the state $p_x$, we read without modifying the stack or changing state until we read a $\#$, at which point we move to the state $q_x$. Here, we pop an $X$ off the stack for each symbol read. When the stack is empty, the automaton fails unless the next symbol read is not $x$, in which case it moves to $q$ and accepts.

(Note, the way I have described this shows that $X$ could equally well be any finite alphabet. Also, it's possible to use less states if you prefer.)