How does the syntax of propositional logic guarantee unique readability?


A textbook I'm reading presents the following recursive definition for the wffs of propositional logic:

  1. Any basic proposition is a wff.
  2. If $\alpha$ and $\beta$ are wffs, then so are:
    • $\lnot \alpha$
    • $(\alpha \land \beta)$
    • $(\alpha \lor \beta)$
    • $(\alpha \rightarrow \beta)$
    • $(\alpha \leftrightarrow \beta)$
  3. Nothing else is a wff.

It then briefly mentions that the syntax of propositional logic has the property of "unique readability", in the sense that there is never any uncertainty regarding which connective is the main one, and never any uncertainty regarding the arguments of the main connective.

Since this is an introductory text, it doesn't go into much more detail aside from mentioning that a consequence of unique readability is that every wff can be decomposed in essentially the same way. I understand why this is so, but I'm not sure exactly which aspects of the above definition are responsible for guaranteeing unique readability. Is it just the parentheses?

In an informal sense, the parentheses in that formulation guarantee "unique readability" (or unambiguity, as it would be termed in formal language theory).

But that's not a formal description; a phrase-structure grammar is either ambiguous or unambiguous as a whole; it's not a composable property. (It's not even a computable property, but that's a whole other question.) So while it's certainly the case that removing the parentheses, or even a single pair of parentheses, would lead to an ambiguous grammar, parentheses are not necessary to create an unambiguous grammar.

It's also worth noting that the syntax of a wff is superficial, although it seems to be quite common to conflate the syntax with the semantics, as in NJJ Smith's account. That's probably suitable for an introductory logic course, and I don't think that it's appropriate to get into a detailed ontological exposition here, but I personally think that it's useful to think about syntax and semantics as two separable concerns.

In formal-language theory, syntax is often described by a formal grammar written in a style called BNF. The syntax presented by Smith for propositional logic basically consists of the following BNF (although I didn't write out the full definition of propositions ($P$) because it's not really relevant). The "non-terminal" $W$ describes the syntax of a wff:

$$\begin{align} S&\to W\\ W&\to P\\ W&\to\lnot W\\ W&\to(W \land W)\\ W&\to(W \lor W)\\ W&\to(W \implies W)\\ W&\to(W \iff W)\\ P&\to <\text{any uppercase letter, possibly with an integer subscript}>\\ \end{align} $$ When you read that grammar, be aware that all symbols other than capital letters and → are just symbols in the language being defined; capital letters are symbols in the grammar and → indicates a possible rewriting. (Note: I changed $\rightarrow$ and $\leftrightarrow$ in your original to $\implies$ and $\iff$ in order to avoid having to find a way to distinguish between the grammatical arrow symbol and the symbol in the language being defined.)
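To make the grammar concrete, here is a minimal recursive-descent recognizer for it, sketched in Python. The ASCII symbols are my own stand-ins (`~` for $\lnot$, and `&`, `|`, `>`, `=` for the four binary connectives), and propositions are single uppercase letters for simplicity:

```python
# Sketch: a recognizer for fully parenthesized wffs.
# Symbol choices are illustrative: ~ is negation; & | > = are the
# binary connectives; a single uppercase letter is a basic proposition.

BINARY = set("&|>=")

def parse_wff(s, i=0):
    """Return the index just past one wff starting at s[i], or raise ValueError."""
    if i >= len(s):
        raise ValueError("unexpected end of input")
    c = s[i]
    if c.isupper():                 # W -> P: a basic proposition
        return i + 1
    if c == "~":                    # W -> ~W
        return parse_wff(s, i + 1)
    if c == "(":                    # W -> (W op W)
        j = parse_wff(s, i + 1)
        if j >= len(s) or s[j] not in BINARY:
            raise ValueError("expected a binary connective")
        k = parse_wff(s, j + 1)
        if k >= len(s) or s[k] != ")":
            raise ValueError("expected ')'")
        return k + 1
    raise ValueError(f"unexpected symbol {c!r}")

def is_wff(s):
    try:
        return parse_wff(s) == len(s)
    except ValueError:
        return False
```

Note that the recognizer never has to backtrack: at every point the next symbol determines which rule applies, which is one informal way to see that this grammar is unambiguous.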

A formal grammar is just a game with symbols; it has no semantic significance. The above productions are rules for rewriting a sequence of symbols. In the game, you start with the sequence $S$ and then apply some applicable rewriting rule (that is, a rule which allows rewriting a grammar symbol actually in the sequence) until you reach a sequence with no grammar symbols; if you succeed, you can conclude that the resulting sequence (called a "sentence") is in the language being defined. The list of rewrites which ends with some valid sentence is called a derivation of the sentence.
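The rewriting game above can be played mechanically. Here is a small sketch that records one derivation under the grammar; to avoid clashing with the convention that uppercase letters are grammar symbols, I use lowercase `p` and `q` as the propositions (and collapse the $W\to P\to\text{letter}$ steps into one):

```python
# Sketch: play the rewriting game, recording a derivation of "(p&~q)".
# "S" and "W" are grammar symbols; everything else is a language symbol.

def rewrite_leftmost(sentential, rhs, symbol="W"):
    """Replace the leftmost occurrence of a grammar symbol with rhs."""
    i = sentential.index(symbol)
    return sentential[:i] + rhs + sentential[i + 1:]

steps = ["S"]
steps.append(rewrite_leftmost(steps[-1], "W", "S"))   # S -> W
steps.append(rewrite_leftmost(steps[-1], "(W&W)"))    # W -> (W & W)
steps.append(rewrite_leftmost(steps[-1], "p"))        # leftmost W -> p
steps.append(rewrite_leftmost(steps[-1], "~W"))       # W -> ~W
steps.append(rewrite_leftmost(steps[-1], "q"))        # W -> q
# steps is now: S, W, (W&W), (p&W), (p&~W), (p&~q)
```

The final sequence contains no grammar symbols, so `(p&~q)` is a sentence of the language, and `steps` is a derivation of it.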

Rules can be applied in any order, and it's pretty easy to prove that the application order doesn't affect which sentences are derivable. So there can be lots of different derivations for the same sentence, and it's useful to define a canonical order. One such canonical order is the simple rule that at each step, we must rewrite the leftmost grammar symbol. That gives rise to what is called a "leftmost derivation", and we say that a grammar is unambiguous if every possible string of language symbols has at most one leftmost derivation. (A string with no derivation is not in the language, so another way to say that is that every sentence in the language has exactly one leftmost derivation.)
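To see ambiguity in action, consider what happens if we strip all the parentheses from the rules, leaving the bare grammar $W\to W\mathbin{op}W \mid \lnot W \mid P$. The sketch below (my own illustration, with the same ASCII stand-ins for the connectives) counts the distinct parse trees of a string under that grammar; any count above one witnesses ambiguity:

```python
# Sketch: count parse trees under the parenthesis-free (ambiguous)
# grammar  W -> W op W | ~W | letter,  where op is one of & | > =.

from functools import lru_cache

def count_parses(s):
    @lru_cache(None)
    def go(lo, hi):
        if hi <= lo:
            return 0
        n = 0
        if hi - lo == 1 and s[lo].isupper():
            n += 1                       # W -> letter
        if s[lo] == "~":
            n += go(lo + 1, hi)          # W -> ~W
        for k in range(lo + 1, hi - 1):  # W -> W op W, split at each op
            if s[k] in "&|>=":
                n += go(lo, k) * go(k + 1, hi)
        return n
    return go(0, len(s))
```

For example, `P&Q|R` has two parse trees, corresponding to the readings $(P\land Q)\lor R$ and $P\land(Q\lor R)$, so the parenthesis-free grammar is ambiguous.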

So that's the syntactic view of ambiguity, and it more or less corresponds to what Smith calls unique readability, except that the goal of unique readability is actually to read the sentence; there is some semantic structure being teased out. In his description of unique readability, Smith talks about the "main connective" of a wff. In terms of the above grammar, the "main connective" is the language symbol in the rewrite rule used to rewrite the $W$ corresponding to the wff. After the rewrite, there are two $W$ symbols (unless the main connective was $\lnot$), effectively dividing the original wff into two parts, each of which can be recursively derived.
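For a fully parenthesized wff, the main connective can be recovered without building the whole derivation: it is the one binary connective sitting at parenthesis depth 1 (or the leading $\lnot$). A sketch, again using my ASCII stand-ins for the connectives:

```python
# Sketch: find the main connective of a fully parenthesized wff.
# Symbols are illustrative: ~ is negation; & | > = are binary connectives.

def main_connective(wff):
    """Return (index, symbol) of the main connective, or None for an atom."""
    if wff.startswith("~"):
        return 0, "~"
    depth = 0
    for i, c in enumerate(wff):
        if c == "(":
            depth += 1
        elif c == ")":
            depth -= 1
        elif c in "&|>=" and depth == 1:
            return i, c
    return None          # a basic proposition has no connective
```

That this scan always finds exactly one candidate is a consequence of unique readability; the two substrings on either side of the main connective are themselves wffs, ready for recursive decomposition.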

As I said, there are lots of possible unambiguous grammars which could be used to describe wffs, using different surface representations. For example, we could eliminate the close parentheses from the above grammar, forming the grammar: $$\begin{align} W&\to P\\ W&\to\lnot W\\ W&\to(W \land W\\ W&\to(W \lor W\\ W&\to(W \implies W\\ W&\to(W \iff W\\ P&\to <\text{any uppercase letter, possibly with an integer subscript}>\\ \end{align} $$ That leads to expressions like $((P \lor Q \iff(R\implies(P\land\lnot Q$. (I'll leave it as an exercise to figure out where the close parentheses go in that expression. But there's only one answer.) Similarly, we could keep the close parentheses and eliminate all the open parentheses. Still unambiguous.

But we cannot eliminate all the parentheses without creating an ambiguity, unless we change the syntax a bit. Consider the following parenthesis-free unambiguous grammar: $$\begin{align} W&\to P\\ W&\to\lnot W\\ W&\to\land W W\\ W&\to\lor W W\\ W&\to\implies W W\\ W&\to\iff W W\\ P&\to <\text{any uppercase letter, possibly with an integer subscript}>\\ \end{align} $$ This notation, which should be called Łukasiewicz notation but is usually described in English as "Polish notation" (as though there were only one Polish mathematician), is unambiguous without parentheses, with the advantage that it can be used to write operators which take more than two operands, as long as you know how many operands every operator takes. With this grammar, the above sentence would be written $\iff\lor\; P\; Q \implies R\; \land\; P\; \lnot\; Q$. (A useful feature of Łukasiewicz's notation is that the main connective is always the first symbol.)
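The "as long as you know how many operands every operator takes" condition is exactly what a one-pass recognizer for Polish notation relies on. A sketch, with my usual ASCII stand-ins and an explicit arity table:

```python
# Sketch: recognize Łukasiewicz (Polish) prefix notation in one pass,
# using only the arity of each operator.  Symbols are illustrative:
# ~ is unary; & | > = are binary; uppercase letters are propositions.

ARITY = {"~": 1, "&": 2, "|": 2, ">": 2, "=": 2}

def parse_prefix(s, i=0):
    """Return the index just past one prefix wff starting at s[i]."""
    if i >= len(s):
        raise ValueError("unexpected end of input")
    c = s[i]
    if c in ARITY:
        j = i + 1
        for _ in range(ARITY[c]):   # consume exactly arity-many operands
            j = parse_prefix(s, j)
        return j
    if c.isupper():                 # a basic proposition
        return i + 1
    raise ValueError(f"unexpected symbol {c!r}")

def is_prefix_wff(s):
    try:
        return parse_prefix(s) == len(s)
    except ValueError:
        return False
```

In this encoding the sentence above becomes `=|PQ>R&P~Q`, and the recognizer accepts it deterministically: the first symbol is always the main connective, and the arity table dictates exactly how many operands to read, so no parentheses are ever needed.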

Moreover, I could write a grammar which does use parentheses but is still ambiguous. As a dumb example, putting the parentheses around the operator instead of the entire subformula leaves the grammar ambiguous. A more troublesome example is to take an unambiguous grammar and add a new syntax using parentheses in a different way, as in many languages derived from C, in which parentheses are used both for function calls and for casts, leading to an ambiguity which wouldn't exist if only one of those syntaxes had been added. It's with this in mind that I said at the beginning that unambiguity is not composable: two independently unambiguous grammars cannot be trivially combined to form a new unambiguous grammar, since ambiguity might result from the interaction.