I'm trying to get my parser generator to accept this specification. I know that's kind of a programming question, but I figured this was the best place to ask. It's specified as a Parsing Expression Grammar, I think.
start = Expr
Expr = [0-9]+ / Sum / Product / '(' Expr ')'
Product = Expr '*' Expr
Sum = Expr '+' Expr
Note: I know nothing about formal grammars beyond what I've read on Wikipedia. If there are any good online resources, I'd love to hear about it.
In any case, my generator is throwing this error:
Left recursion detected for rule "Expr".
What I find strange is that it's perfectly willing to accept this:
start = Expr
Expr = [0-9]+ / '(' Expr ')'
Which looks just as recursive as the first one to me.
A "left-recursive" grammar means that the parser can get into a loop in the parsing rules without making any progress consuming the input. Imagine that each production is a subroutine that might eat some tokens or call some other subroutines. For example, consider this production:
This means that the
exprparser will first call thenumberparser (which might consume some tokens), then try to eat a+token (and fail if the next token is not of that type) and then call thenumberparser again to consume more tokens. If that is all successful, theexprparser will yield success.Now consider this left-recursive production:
The first thing the
exprparser will do is to call itself recursively, putting itself into an unproductive infinite loop.In contrast, consider this production, which is not left-recursive:
The first thing the parser will do is try to consume an open parenthesis token, and it will call itself recursively only if that is successful. It can never get into an infinite loop doing this because there are only so many parentheses and it has to eat one each time before it can recurse; eventually it must run out of parentheses and then the recursion will stop. Your
Expr = [0-9]+ / '(' Expr ')'example is similarly safe.Here is a portion of the left-recursive grammar you showed:
If the next token is not
[0-9]+, then theexprparser will callsum, which will immediately callexpragain, and the parser will be in an infinite loop.Usually you can rewrite rules like
sumin a non-left-recursive form. In this case it will look something like this:This is not left-recursive because the
more_termsparser must consume a*or a+token before it can call itself recursively; it is impossible for the parser to get into an infinite loop by doing this. Similarly theexprparser callsterm, buttermwill not call back toexprwithout consuming a parenthesis.In general the transformation says to turn a left-recursive grammar like this:
into a non-left-recursive one like this:
I don't know if there are any good online resources (I suppose there must be) but if you like Perl, chapter 8 of Higher-Order Perl is available online and discusses precisely this point.
Also, I hope it is not insulting to suggest that you read the Wikipedia article about left recursion, and perhaps the article about recursive descent parsing, which is the name for the technique your parser generator is trying to use.