NFA (nondeterministic finite automaton) made out of the Bible?

155 Views Asked by At

Let B be the language over alphabet {a, ... z} consisting of those words occuring in the Bible. Thus, B = {in,the,beginning,god,created,...}.

Describe an NFA whose language is B. Describe what happens when you apply subset construction to this NFA to get a DFA, ignoring unreachable states. Roughly, how do the number of states in each compare? How does each number compare to the # of distinct words in the Bible?


Of course, this is homework so I'm not asking for answers so much as a bit of a shove in the right direction. I'm pretty thoroughly stumped here.

I'm picturing the NFA as a series of letters that go down an ordered path in the same order that the letters show up in the Bible. Which not only seems really impractical but a little too stupid of an answer for it to be correct. What would an NFA of this even look like? Would it really be huge?

Thanks for any help.

1

There are 1 best solutions below

1
On

According to this page, there are 788,258 words in the Bible but only 14,565 distinct words. Now, it is shown in [1] that a full Scrabble lexicon with 94,240 entries can be represented be a DFA with 19,853 states. So you can expect the DFA of the Bible to have roughly $19,853 \times \frac{14,565}{94,240} \approx 3068$ states.

[1] A. W. Appel et G. J. Jacobson, The world fastest scrabble program, Commun. ACM 31 (1988), 572–585.