When I took Theory of Computation, the main points of interest were three kinds of automata (finite, pushdown, and Turing), one type of expression (regular expressions, which are equivalent to finite automata), and one type of grammar (context-free grammars, which are equivalent to pushdown automata). Although there were brief notes about other types of grammars and automata and what sorts of languages they can recognize, the only hierarchy explored was the chain $$\text{regular} \subsetneq \text{context-free} \subsetneq \text{context-sensitive} \subsetneq \text{recursively enumerable},$$ and even then, context-sensitive was only mentioned in passing.
I always thought this was a sort of ugly and ad hoc structure, so a few decades ago, I sat down and tried to come up with something more fine-grained and more motivated by the nature of the formalism (automata, expressions, and grammars) than by the languages they recognized. I don't recall all the details, but it went something like this:
| automata | expression | grammar | language |
|---|---|---|---|
| 1-node acyclic | empty | single empty production | null |
| 2-node acyclic | single-letter | single-production, single symbol on right | trivial |
| linear graph | no operators | single-production | singleton |
| acyclic graph | alternation only | single-nonterminal on left, no mutual productions | finitary |
| general graph | add closure | single-nonterminal on left, restricted mutual productions | regular |
| re-entrant | add recursive labels | single-nonterminal on left | context-free |
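To make the "general graph" row concrete, here is a minimal sketch in Python (the names `Automaton` and `accepts` are mine, purely illustrative): an automaton is just a directed graph with letter-labeled edges, one START node, and STOP nodes, simulated by tracking the set of reachable nodes.

```python
# A minimal sketch of the "general graph" row: an automaton is a directed
# graph with letter-labeled edges, one required START node, and at least
# one STOP node. Class and method names here are illustrative.

class Automaton:
    def __init__(self, start, stops, edges):
        self.start = start        # the required START node
        self.stops = set(stops)   # the STOP nodes
        self.edges = edges        # list of (src, letter, dst) triples

    def accepts(self, word):
        # Standard subset simulation: track every node reachable so far.
        current = {self.start}
        for letter in word:
            current = {dst for (src, lab, dst) in self.edges
                       if src in current and lab == letter}
        return bool(current & self.stops)

# A cycle in the graph is what "add closure" buys: this machine recognizes (ab)*.
m = Automaton(start=0, stops=[0], edges=[(0, "a", 1), (1, "b", 0)])
assert m.accepts("abab")
assert not m.accepts("aba")
```

Deleting the cycle (making the graph acyclic) collapses the language to a finite set, matching the "acyclic graph / finitary" row.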
Notes
It went beyond this, but I think this is enough to give you the idea.
I'm not sure about all of the equivalences. Like I said, I did this a long time ago.
Automata are directed graphs whose edges are labeled by letters, with one required START node and at least one STOP node. A linear graph is an acyclic graph in which every node has at most one in-edge and at most one out-edge. A re-entrant automaton is a version of the pushdown automaton without that ugly explicit stack. It has two new node types, ENTER and EXIT: an EXIT node jumps back to the most recent ENTER node.
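The ENTER/EXIT description leaves room for interpretation, so here is a simulator under one possible reading (my assumption, not the post's definition): an ENTER edge jumps into a subgraph and remembers where to resume, and an EXIT node resumes after the most recent ENTER. The return stack below lives only in the simulator, not in the machine description itself.

```python
# Sketch of a re-entrant automaton under an ASSUMED semantics: an "enter"
# edge (src, target, resume) jumps from src to target and remembers resume;
# an EXIT node resumes at the most recently remembered point. The return
# stack is implicit: it exists only in this simulator, not in the machine.

def accepts(word, edges, enters, exits, start, stops):
    seen = set()  # configurations already explored, to cut off repeats

    def run(i, node, stack):
        cfg = (i, node, stack)
        if cfg in seen:
            return False
        seen.add(cfg)
        if node in exits and stack:
            if run(i, stack[-1], stack[:-1]):   # resume after the ENTER
                return True
        if i == len(word) and node in stops and not stack:
            return True
        for (src, letter, dst) in edges:        # ordinary lettered edges
            if src == node and i < len(word) and word[i] == letter:
                if run(i + 1, dst, stack):
                    return True
        for (src, target, resume) in enters:    # re-enter the subgraph
            if src == node and run(i, target, stack + (resume,)):
                return True
        return False

    return run(0, start, ())

# Under this reading, the machine below recognizes a^n b^n (n >= 1),
# a canonical non-regular, context-free language.
edges  = [(0, "a", 1), (1, "b", 3), (2, "b", 3)]
enters = [(1, 0, 2)]   # from node 1, re-enter at node 0, resuming at node 2
exits  = {3}
stops  = {3}
assert accepts("aabb", edges, enters, exits, 0, stops)
assert not accepts("aab", edges, enters, exits, 0, stops)
```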
All grammars have at least one production with the start symbol on the left.
Here is an example of a recursive label R in an expression: "ab(R:aRb)ab". The idea is that the labeled subexpression can either match the empty string or match itself recursively, so R matches $a^nb^n$ and the whole expression matches $ab\,a^nb^n\,ab$. Note that non-recursive labels add no computational power; they only let you shorten an expression.
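Reading the example as R = (empty | a R b), a small recursive matcher (function names below are mine) shows the expression "ab(R:aRb)ab" in action:

```python
# Sketch of matching the example "ab(R:aRb)ab", reading the labeled
# subexpression as R = (empty | a R b). Function names are illustrative.

def match_R(s, i):
    """Return every index j such that s[i:j] matches R = empty | a R b."""
    ends = {i}                        # R may match the empty string
    if i < len(s) and s[i] == "a":
        for j in match_R(s, i + 1):   # the recursive occurrence of R
            if j < len(s) and s[j] == "b":
                ends.add(j + 1)
    return ends

def match_expr(s):
    """Match the whole expression ab (R: aRb) ab."""
    if len(s) < 4 or not (s.startswith("ab") and s.endswith("ab")):
        return False
    return len(s) - 2 in match_R(s, 2)

assert match_expr("abab")              # R matches the empty string
assert match_expr("ab" "aabb" "ab")    # R matches a^2 b^2
assert not match_expr("ab" "aab" "ab")
```

This is exactly where the extra power over regular expressions comes in: the middle part matches $a^nb^n$, the textbook non-regular language.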
Question
I don't think any of this is especially difficult but it was very cool, so I've always wondered if someone somewhere has published a paper or book that explores this hierarchy. I'm primarily interested in the structure of the problem space and the comparison of the different formalisms rather than in the related computability issues.
Can anyone point me in the direction of some papers or books that have similar goals?
Automata Theory was (at least in the US) much more popular in the '60s and '70s than it is today. There are still applications today in Programming Language Theory (aka "Theory B"). If you look at CS Theory conferences with a Track B and at PLV conferences, you may find current literature.
As for textbooks, the first edition of Hopcroft–Ullman and *Algebraic Automata Theory* (https://www.amazon.com/Algebraic-Automata-Cambridge-Advanced-Mathematics/dp/0521604923) are quite good.
You may also find Symbolic Dynamics to be of interest (e.g., https://home.cs.colorado.edu/~raf/media/papers/one-block.pdf).
Note that from a Complexity Theoretic perspective, there are some well-known equivalences: regular languages correspond to finite automata, context-free languages to pushdown automata, context-sensitive languages to linear-bounded automata, and recursively enumerable languages to Turing machines.
So from a Complexity perspective, folks don't usually look too closely at the grammar structure. Simulating the underlying model is of interest, as this allows us to relate complexity classes.
From a PLV perspective, programming-language grammars are slightly more powerful than CFGs, but not so powerful that our compilers will struggle to validate the syntax efficiently, etc. So from that perspective, it's not surprising that folks don't look much beyond CFGs.