Logic done with syntax trees instead of sequences of characters?


Last year, I took a course in logic in which sentences were defined inductively as sequences of characters in a certain way. I am now reading Shoenfield in an attempt to look deeper into the subject, and he too defines sentences as sequences of characters.

The first theorem he proves is effectively equivalent to: any sentence can be uniquely parsed into a syntax tree.

That said, why not forgo the definition of sentences as sequences and use syntax trees from the get-go? Is there any benefit to using sequences, other than that being how we are used to writing propositions? Does anyone out there forgo sequences entirely?



There are indeed situations where syntax trees are easier to handle than sentences-as-strings: every induction-on-complexity argument provides such an example. However, the converse also holds sometimes. For instance, it's a bit easier to linearly order the set of sequences, and we often want an enumeration of the expressions in our language.

Ultimately, though, the translation between the two is elementary enough that their respective technical advantages and disadvantages quickly stop mattering. In light of this, we primarily use sequences because they better match what we actually think of as "sentences." One aspect of this is that I think sentences are generally easier for a person to parse than syntax trees. They're also easier to write on a page: trees take up a lot more space.

In summary, the mathematical differences between the two aren't obviously substantial enough to provide strong motivation in either direction, and sentences seem to have a nontrivial "human advantage" over trees.


In my experience, which is more programming language theory (PLT) and type theory (and generally CS-y) than traditional logic, we virtually never work in terms of sequences of characters. I, personally, view logic texts that put a lot of focus on this as using their time and space poorly; this made somewhat more sense before computers were widely used. It's certainly important to know that you can go between sequences of characters and trees in a computational way, but it's not a hard fact to believe, and the details are almost always irrelevant. As long as you know what tree is represented by the text in the book, that's generally all you need to know.

I'm not aware of any introductions that actually draw the trees as a matter of course as that would be quite verbose and typographically awkward.

What PLT/etc. people usually do is write pseudo-CFGs (context free grammars) such as $$\varphi,\psi ::= A \mid \varphi\supset\psi \mid \bot$$ and then go on to write things like $((P\supset\bot)\supset\bot)\supset P$ with only, maybe, some informal comments about precedence, grouping with parentheses, and the syntax of atomic formulas.

In formal (i.e. machine-checked) expositions or other code operating on logic, you'll see data types like:

data IPC⟨→⟩ : Set where
    ⋆_  : V → IPC⟨→⟩
    _⊃_ : IPC⟨→⟩ → IPC⟨→⟩ → IPC⟨→⟩
    ⊥c  : IPC⟨→⟩

which describes the syntax tree of IPL but not really the concrete syntax. (Of course, the definition of Agda, in this case, specifies the syntax of code that uses these constructors.) You can't do an induction on the sequence of characters making up the formulas as there isn't one. If you wanted to do such a thing, you'd have to define a function from this type into the type of sequences of characters (and prove theorems that such an encoding is faithful). If you read the full article from which I got the above data type, it never does this nor needs to.

This isn't to say that the field of formal languages (parsing theory, context-free grammars, and so on) isn't interesting. It's just not logic, though the two fields often share techniques.