Software/algorithm for the smallest context free grammar describing a set of words?

410 Views Asked by At

I am looking for software/algorithm for the smallest context free grammar describing a finite set of words (and no other words).

For a single word I found sequitur

Related to this seems: given a CFG what is the shortest functionally equivalent CFG?

I would prefer the size to be measured by the CNF (for a precise estimate of the size).

To clarify I know the set can be described by a regular expression, this question is about the smallest CFG.

1

There are 1 best solutions below

1
On

Any finite set of words is recognized by a finite state automaton (i.e., a regular expression) and therefore does not need the expressive power of a CFG. You can build a deterministic or non-deterministic automaton and minimize it. To do this programmatically, use a Lex tool from the Lex/Yacc tool family. The size of the automaton (typically the number of states is used) is a measure for the complexity of the set.