Number of states required to recognize $\{ ss : s \in \{ 0 , 1 \}^*, |s| = i \}$ and its complement

237 Views Asked by At

$$\Sigma = \{0,1\}\;\\ S_{i} = \left\{ss: s\in {\Sigma}^{*} \text{and $s$ has length $i$}\right\}$$

  1. Prove that for any $i$, any DFA recognizing $S_{i}$ must have $2^{i}$ or more states.
  2. Design a NFA using fewer states to recognize the complement of $S_{i}$

So far I am not sure how to approach this problem. I haven't even come up with a $2^{i}$ state DFA that can recognize the language. Does anyone know how to do this problem? I would really appreciate if you can tell me your solution.

2

There are 2 best solutions below

12
On BEST ANSWER

You’ll not be able to show that there is a DFA with $2^i$ states that recognizes $S_i$: it’s not hard to show that a DFA that recognizes $S^0=\{\epsilon\}$ must have a minimum of $2$ states, not $2^0=1$, and a DFA that recognizes $S_1=\{00,11\}$ requires at least $5$ states, not $2^1=2$.

The first half of the problem must hinge on the fact that there are $2^i$ words of length $i$ in $\Sigma^*$. Suppose that $M$ is a DFA with fewer than $2^i$ states. Then by the pigeonhole principle there are a state $q$ of $M$ and words $u,v\in\Sigma^*$, each of length $i$, such that $u\ne v$, but $M$ is in state $q$ after reading either $u$ or $v$. Since $M$ recognizes $uu$, it must also recognize $vu$, which it should not. Thus, $M$ must have at least $2^i$ states.

Added: For the NFA that Marko suggested in the comments you have an initial state $q_0$, states $q_{k,\ell}$ for $k,\ell\in\{1,\dots,2i\}$, and a state $q_f$. You have the following transitions.

$$\begin{align*} q_0&\overset{0}\longrightarrow q_{1,1}\\ q_0&\overset{1}\longrightarrow q_{1+i,1}\\ q_{1,i}&\overset{1}\longrightarrow q_{1,1+i}\\ q_{1+i,i}&\overset{0}\longrightarrow q_{1+i,1+i}\\ \end{align*}$$

For each $k\in\{2,\ldots,i\}$:

$$\begin{align*} q_0&\overset{0,1}\longrightarrow q_{k,1}\\ q_0&\overset{0,1}\longrightarrow q_{k+i,1}\\ q_{k,k-1}&\overset{0}\longrightarrow q_{k,k}\\ q_{k+i,k-1}&\overset{1}\longrightarrow q_{k+i,k}\\ q_{k,k+i-1}&\overset{1}\longrightarrow q_{k,k+i}\\ q_{k+i,k+i-1}&\overset{0}\longrightarrow q_{k+i,k+i} \end{align*}$$

For each $k\in\{1,\ldots,2i\}$: $$q_{k,2i}\overset{0,1}\longrightarrow q_f$$

For each $\langle k\in\{1,\ldots,2i\}$ and $\ell\in\{1,\ldots,2i-1\}$ for which the transition from $q_{k,\ell}$ to $q_{k,\ell+1}$ has not yet been defined: $$q_{k,\ell}\overset{0,1}\longrightarrow q_{k,\ell+1}$$

And finally, $$q_f\overset{0,1}\longrightarrow q_f$$

Every one of the $4i^2+2$ states is an acceptor state. Think of $q_{k,\ell}$ as being in row $k$ and column $\ell$ of a $2i\times 2i$ array of states, with $q_0$ to the left and $q_f$ to the right. The only paths through the machine are from $q_0$ along one row to $q_f$. If I managed to keep my subscripts straight, if $k\in\{1,\ldots,i\}$, a word gets through the path along row $k$ to $q_f$ iff it has $0$ in position $k$ and a $1$ in position $k+i$, and it gets through the path along row $k+i$ to $q_f$ iff it has a $1$ in position $k$ and a $0$ in position $k+i$.

You may have to give it some thought, but this NFA accepts every word of length less than $2i$: even if it’s of the form $uv$, where $|u|=i$, and $v$ is a proper initial segment of $v$, it will be accepted on row $i$ or row $2i$, depending on whether its $i$-th symbol is $0$ or $1$, respectively. Thus, the NFA correctly handles every word of length at most $2i$. However, it doesn’t quite do what we want: it rejects words of the form $uv$ where $u\in S_i$ and $|v|>0$.

You can repair this by adding states $q_k$ for $k\in\{1,\ldots,2i\}$, with transitions $q_k\overset{0,1}\longrightarrow q_{k+1}$ for $k\in\{0,\ldots,2i-1\}$ and $q_{2i}\overset{0,1}\longrightarrow q_f$. This gives you a path of length $2i+1$ from $q_0$ to $q_f$ that accepts any word of length greater than $2i$.

The total number of states is now $4i^2+2i+2$, but that’s still a polynomial in $i$, and one of low degree at that, so it’s less than $2^i$ for sufficiently large $i$. (If my hasty calculations are correct, $i\ge 9$ is large enough.)

0
On

The minimal (incomplete) DFA for $S_i$ has $3 \times 2^i -2$ states (one more for a complete automaton). The first values are $1, 4, 10, 22, 46$. This can be shown using a small combinatorial argument on the number of distinct suffixes of the words of length $i$ (let me know if you need more details on this part).

There is a NFA with $2 + 7 \frac{i(i+1)}{2}$ states recognizing the complement of $S_i$. The construction, a slight improvement over the previous answers, is depicted here for $i = 3$ (44 states) and the general construction should be transparent from this example. I do not know whether this bound is optimal.