Consider the attached NFA (from Sisper's Introduction to the Theory of Computation, 3e) which has been deduced in order to accept precisely the language containing the string $ab$. The rest of the notation should be pretty standard as far as I know, but please let me know if not.
My question is why do we need the $\varepsilon$ transition (and the corresponding extra state) in this NFA? Why wouldn't the NFA obtained from the one above by deleting this intermediate state be precisely equivalent to the NFA given above (and thus preferable given that it is apparently more immediate)? I suspect I must be missing something and that this simpler NFA is not in fact equivalent somehow.
The relevant definition that I am working with for an NFA to accept a given string $w$:
Let$ N = (Q,\Sigma,\delta,q_0,F)$ be an NFA and $w$ a string over the alphabet $\Sigma$. Then we say that $N$ accepts $w$ if we can write $w$ as $w = y_1y_2 ···y_m$, where each $y_i$ is a member of $\Sigma \cup \varepsilon$ and a sequence of states $r_0,r_1,...,r_m$ exists in $Q$ with three conditions:
- $r_0 = q_0$,
- $r_{i+1} \in \delta(r_i,y_{i+1})$, for $i=0,...,m−1$, and
- $r_m \in F$.
Condition 1 says that the machine starts out in the start state. Condition 2 says that state $r_{i+1}$ is one of the allowable next states when $N$ is in state $r_i$ and reading $y_{i+1}$. Observe that $\delta(r_i, y_{i+1})$ is the set of allowable next states and so we say that $r_{i+1}$ is a member of that set. Finally, condition 3 says that the machine accepts its input if the last state is an accept state.
Given the above, the string $ab$ can be written with $y_1 = a, y_2 = b$ and so my proposed machine would seem to accept this. It would also seem to accept only this string. What am I missing?
The $\epsilon$ edge is unnecessary: the NFA without it is equivalent. I'm not familiar with the text, but I suspect it emerged from a generalized recursive process for generating an NFA for the concatenation of two languages.
Specifically, given an NFA $G_1$ for a language $L_1$ and an NFA $G_2$ for a language $L_2$, we can construct an NFA for the language $L_1L_2 = \{w_1 w_2 | w_1 \in L_1, w_2 \in L_2\}$ by drawing an $\epsilon$-edge from every end state of $L_1$ to every start state of $L_2$.
When doing this construction, it's important to keep $G_1$ and $G_2$ separated: their start states and end states may be used in intermediate ways and a more straightforward merging (as we would want to apply in your simple example) risks creating paths from $G_2$ back to $G_1$.
It's worth noting that there is an algorithm for removing $\epsilon$-edges from an NFA, but it's a bit gross to apply generally: one has to identify all pairs of nodes where there is path from one to the other consisting of just one non-$\epsilon$ character and add an edge between them with that character.