I'm confused about the following DFA problem:
Let L denote the set of all strings in $\{a, b\}^∗$ that contain abb or aab as a substring. Show that any DFA that decides L must have at least five states.
I think L can be decided in a 4 state DFA as follows:
q0 (start state):
a -> q1
b -> q0
q1:
a,b -> q2
q2:
b -> q3
a -> q0
q3 (accept state)
a,b -> q3
Note: q3 is the only accept state Why does this DFA not decide L?
As Rolf Hoyer noted in the comments, your DFA does not accept $aaab$, which is in $L$. The problem is that after you’ve read $a$, it really does make a different whether the next input is an $a$ or a $b$: if it’s an $a$, you can read any number of $a$s and still be two-thirds of the way to having the substring $aab$, but if it’s a $b$, then another $a$ puts you back at only one-third of the way to an acceptable substring.
The easiest argument to show that at least five states are needed uses the Myhill-Nerode theorem: show that the strings $a$, $b$, $aa$, $ab$, and $aab$ all have distinguishing extensions and therefore must correspond to different states.