Free variables, parameters, dummy names -- what policy to choose?


A long lead-up (sorry!) to a short question. Start with a semi-formal argument in Loglish:

(1) $\quad$ for every $x$, $x$ is $F$

(2) $\quad$ for every $x$, ($x$ is $F$ $\to$ $x$ is $G$)

Arbitrarily choose some object in the domain, $a$; then we have

(3) $\quad a$ is $F$

(4) $\quad a$ is $F$ $\to$ $a$ is $G$

(5) $\quad a$ is $G$

But $a$ was arbitrarily chosen; so

(6) $\quad$ for every $x$, $x$ is $G$

Obviously, something we have to explain to beginners is: What is '$a$' doing here? This plainly isn’t an ordinary proper name, attached once and for all to a determinately given individual. On the other hand, it isn’t an anaphoric pronoun either. It is a sort of ‘dummy name’, or ‘temporary name’, or ‘ambiguous name’, or ‘arbitrary name’ – perhaps none of those labels is entirely happy, though all are in use. Let's say it is a ‘parameter’.
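For what it's worth, modern proof assistants arguably implement something like a parameter discipline: the step from a universal goal to an instance introduces a fresh local constant, syntactically distinct from the bound variable. A minimal sketch of the argument (1)–(6) in Lean 4 (the names `α`, `F`, `G` are mine):

```lean
-- Sketch of steps (1)–(6); α, F, G are assumed names.
example (α : Type) (F G : α → Prop)
    (h1 : ∀ x, F x)              -- premiss (1)
    (h2 : ∀ x, F x → G x) :      -- premiss (2)
    ∀ x, G x := by               -- conclusion (6)
  intro a                        -- "arbitrarily choose some object a"
  exact h2 a (h1 a)              -- steps (3)–(5) in one go
```

Here `intro a` is exactly the "arbitrary choice" move: `a` is a temporary name local to the subproof, not a quantified variable.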

When we go fully formal, wanting to regiment such arguments into an official first-order language, we need some symbols to play the parameter role informally played by '$a$'. There seem to be three policies on the market, used by different textbooks:

(A) We can use the same symbols both as variables-tied-to-quantifiers and as stand-alone parameters (in conventional jargon, the same symbols can appear as both ‘bound variables’ and ‘free variables’).

(B) We can use the same symbols both as proper names and as parameters/dummy names (with just some of these symbols getting a fixed denotation in a given language).

(C) We can use distinctive symbols for proper names, for bound variables, and for parameters/dummy names.

Each policy can be made to work. Policy (A) has historically been the most common one among logicians. And it does conform best to the practice of mathematicians who casually move between using letters as dummy names (‘Suppose $m$ is the number of positive roots . . . ’) and using letters of the same kind as quantified variables (‘For any integer $n, (n + 1)(n − 1) = n^2 − 1$’), letting context make it clear what the symbols are doing. However, when we go formal, overloading symbols like this does cause complications: we have to spell out very careful rules for their double use if we are to avoid getting into tangles.
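One concrete instance of the tangle under policy (A): with a single symbol class, naively substituting a term for a free variable can capture that variable under a quantifier, so the substitution rules have to be stated with care. A minimal Python sketch (the tuple encoding of formulas is mine, purely for illustration):

```python
# Formulas as nested tuples (toy encoding, assumed for this sketch):
# ('var', 'x'), ('forall', 'x', body), ('P', term), ('->', a, b)

def free_vars(f):
    tag = f[0]
    if tag == 'var':
        return {f[1]}
    if tag == 'forall':
        return free_vars(f[2]) - {f[1]}
    return set().union(*(free_vars(g) for g in f[1:]))

def subst_naive(f, x, t):
    """Replace free occurrences of variable x by term t -- WITHOUT
    checking for capture (this is the bug policy (A) must legislate away)."""
    tag = f[0]
    if tag == 'var':
        return t if f[1] == x else f
    if tag == 'forall':
        if f[1] == x:            # x is bound here: stop
            return f
        return ('forall', f[1], subst_naive(f[2], x, t))
    return (tag,) + tuple(subst_naive(g, x, t) for g in f[1:])

# Substituting the variable y for x in  forall y. P(x)  "captures" y:
f = ('forall', 'y', ('P', ('var', 'x')))
captured = subst_naive(f, 'x', ('var', 'y'))
# captured == ('forall', 'y', ('P', ('var', 'y')))  -- y is now bound!
```

With syntactically separate parameters (policy (C)), a parameter can never be accidentally bound, so this failure mode disappears by construction.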

Policy (B) is easier to work with, but hasn’t anything special to recommend it. Once we have distinguished, at the semantic level, the fixed-denotation names built into a language from its further supply of dummy names, why not highlight the distinction by using syntactically different symbols for the two styles of name?

Policy (C), then, may be less economical but it does make it easier to keep track of what is going on. Variables only appear bound, parameters only appear free. It conforms to the good Fregean principle of marking important differences of semantic role by using different styles of symbols. It has distinguished antecedents, e.g. in Gerhard Gentzen. It's the policy in e.g. Richmond Thomason's impressive textbook.

Which is the easiest policy for the beginning student to understand? I'd say (C). But -- and here at last is the question --

Is there any strong reason -- issues of ease of understanding apart -- for preferring the more common policy (A) to the less common policy (C)?

There are 3 answers below.


The only thing I can think of is that policy (A), by making it OK to use free variables in statements, lets us more naturally use the system to prove properties and relations of formulas in general. For example, you can prove that $P(x)$ and $\neg \neg P(x)$ are equivalent wffs.

But obviously you can only do those kinds of things if you are really careful about how the variables are used, i.e. about whether a given variable is one of those 'temporary' ones or not. And certainly for beginning students we just work with sentences rather than formulas in general. So policy (C) has a lot to be said for it, I agree.


tl;dr The distinction between parameters and variables already exists in the form of the distinction between constant symbols and variables, and this distinction is commonly made notationally too. The issue, in my opinion, is that we don't fully leverage these distinctions, because we push signatures and variable contexts to the background which (again, in my opinion) is a big mistake. This leads to overburdening the variable concept.

First, I would recommend not defining terms and formulas but instead defining terms-in-context and formulas-in-context. I'll write $\Delta\vdash t$ and $\Delta\vdash\varphi$ where $\Delta$ is a finite set of variables, $t$ is a term, and $\varphi$ a formula whose free variables are contained in $\Delta$.1 Normally, we globally fix a set of variables (often this is left largely implicit), and then we define terms and formulas inductively. This is all well and good, and we can use structural induction to prove facts about all formulas/terms. However, we often don't want to prove things about all formulas but instead about closed formulas. Unfortunately, closed formulas are not an inductively defined subset of all formulas. This is manifested when we attempt to do induction over the set of closed formulas. You quickly realize that to handle the $\forall$/$\exists$ cases, you need to strengthen the induction hypothesis. In other words, you actually need to do induction over formulas-in-context, and then closed formulas are the special case where the context is empty. Categorical semantics also suggests the significance of terms/formulas-in-context, as it generally only assigns meaning to those and not to terms/formulas on their own. Similarly, the distinction between $\Delta\vdash\varphi$ and $\Delta\cup\{z\}\vdash\varphi$ (with $z\notin\Delta$ and thus necessarily $z$ is not free in $\varphi$) is crucial to understanding the quantifiers.
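The in-context idea can be sketched concretely. Here is a minimal Python illustration (the tuple encoding and names are mine): a well-formedness check `wf(delta, phi)` whose $\forall$ case recurses with an *enlarged* context, which is exactly the strengthening needed to do induction that covers closed formulas:

```python
# Formulas as nested tuples (toy encoding, assumed for this sketch):
# ('var', 'x'), ('forall', 'x', body), ('P', term), ('->', a, b)

def wf(delta, phi):
    """True iff every free variable of phi lies in the finite set delta,
    i.e. iff  delta |- phi  is a formula-in-context."""
    tag = phi[0]
    if tag == 'var':
        return phi[1] in delta
    if tag == 'forall':
        # the binder extends the context -- induction goes through contexts
        return wf(delta | {phi[1]}, phi[2])
    return all(wf(delta, sub) for sub in phi[1:])

# forall x. P(x) is closed: well-formed in the empty context
assert wf(set(), ('forall', 'x', ('P', ('var', 'x'))))
# P(x) alone is not closed, but is well-formed in context {x}
assert not wf(set(), ('P', ('var', 'x')))
assert wf({'x'}, ('P', ('var', 'x')))
```

"Closed formula" is then just the special case `wf(set(), phi)`, rather than a non-inductive subset carved out after the fact.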

In categorical semantics, the operation that sends $\Delta\vdash\varphi$ to $\Delta\cup\{z\}\vdash\varphi$ is often called something like $\pi_z^*$. This can be extended to a functor whose right adjoint (if it exists and satisfies a technical condition that corresponds to commuting with substitution appropriately) corresponds to universal quantification (and the left adjoint to existential quantification, though that can be a bit stickier). I'll write $\Delta;\psi\vdash\varphi$ for $\Delta\vdash\psi$ entails $\Delta\vdash\varphi$. The isomorphism of homsets view of adjunction corresponds to the bidirectional rule $$\cfrac{\Delta\cup\{x\};\psi\vdash\varphi}{\Delta;\psi\vdash\forall x.\varphi}\quad\text{or more explicitly}\quad\cfrac{\Delta\cup\{x\};\pi_x^*(\psi)\vdash\varphi}{\Delta;\psi\vdash\forall x.\varphi}$$ because $x$ not being free in $\psi$ is implied for the bottom to be well-formed. This is, of course, universal introduction. The unit and counit are $\Delta;\varphi\vdash\forall x.\varphi$ (where well-formedness again implies that $x$ is not free in $\varphi$ assuming $x\notin\Delta$) and $\Delta\cup\{x\};\forall x.\varphi\vdash\varphi$. These correspond to two commonly taken axioms of universal quantification except that the latter is slightly weaker than usual. The latter does not imply the domain is non-empty, because it is only given for formulas in a context with at least one term, namely $x$, floating around. You cannot infer $\varnothing;\forall x.\bot\vdash\bot$ only $\{x\};\forall x.\bot\vdash\bot$.

Turning now to your specific concern: a natural interpretation of parameters, to me, is as additional constants in the signature. A syntactic distinction between free variables and dummy (bound) variables is rarely made, but an extreme form occurs in CINNI, which uses names for free variables but de Bruijn indices for bound "variables". That said, CINNI is motivated by concerns that only come up for mechanization. I'm not sure notationally distinguishing free variables from bound variables is that helpful.2 The common student issues with this topic that come to mind are variable capture, which this distinction only partially mitigates, and failing to substitute for all free occurrences of a variable, which is largely mitigated by the term/formula-in-context notion.
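For illustration, here is a toy Python sketch (encoding and names mine) of that CINNI-style split: free variables keep their names, while bound occurrences are replaced by de Bruijn indices measuring the distance to their binder:

```python
# Formulas as nested tuples (toy encoding, assumed for this sketch):
# ('var', 'x'), ('forall', 'x', body), ('P', term), ('->', a, b)

def to_debruijn(phi, binders=()):
    """Rename bound occurrences to de Bruijn indices; leave free names alone.
    `binders` lists enclosing binder names, innermost first."""
    tag = phi[0]
    if tag == 'var':
        name = phi[1]
        if name in binders:
            return ('bound', binders.index(name))  # distance to nearest binder
        return ('free', name)                      # free variables stay named
    if tag == 'forall':
        # the binder's name disappears from the output entirely
        return ('forall', to_debruijn(phi[2], (phi[1],) + binders))
    return (tag,) + tuple(to_debruijn(sub, binders) for sub in phi[1:])

# forall x. (P(x) -> P(y)):  x becomes index 0, y stays the name 'y'
phi = ('forall', 'x', ('->', ('P', ('var', 'x')), ('P', ('var', 'y'))))
# to_debruijn(phi) == ('forall', ('->', ('P', ('bound', 0)),
#                                       ('P', ('free', 'y'))))
```

Note how bound and free occurrences end up in genuinely different syntactic classes, which is the mechanized analogue of policy (C).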

To connect parameters, i.e. constant symbols, to your universal generalization example, I actually like taking a different perspective on universal introduction and existential elimination. To facilitate this, it's useful to also parameterize the notion of term (and thus formula) on a signature $\Sigma$, which, for simplicity, we can take to be just a set of constant symbols. So we have $\Delta\vdash_\Sigma t$, which is a term $t$ that only contains variables from $\Delta$ and constant symbols from $\Sigma$, and similarly $\Delta\vdash_\Sigma\varphi$. The universal introduction rule can then be written as $$\cfrac{\Delta;\psi\vdash_{\Sigma\cup\{c\}}\varphi[c/x]}{\Delta;\psi\vdash_\Sigma\forall x.\varphi}$$ which corresponds exactly to the rule used in your informal example. The distinction is clearer for existential elimination. We could imagine a rule like $$\cfrac{\Delta;\psi\vdash_\Sigma\exists x.\varphi}{\Delta;\psi\vdash_{\Sigma\cup\{c\}}\varphi[c/x]}$$ (where $c\notin\Sigma$) which is basically a simple case of Skolemization. If we had extended the variable context instead of the signature, we'd derive $\Delta\cup\{y\};\psi\vdash_\Sigma\varphi[y/x]$, but this would warrant $\Delta;\psi\vdash_\Sigma\varphi[t/x]$ for any term $\Delta\vdash t$ by substituting $t$ for $y$; constants, by contrast, cannot be substituted for, so this problem does not arise in the signature-extending form. Of course, we want to prove something like $\varnothing;\top\vdash_\Sigma\chi$, not $\varnothing;\top\vdash_{\Sigma\cup\{c\}}\chi$, so ultimately we need to produce a formula that does not involve the new $c$ (constant) term. This is, of course, exactly what the natural deduction rule for existential elimination enforces.
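The freshness side condition on $c$ can itself be sketched as a check (toy Python; the encoding and names are mine): the rule is applicable only when $c$ occurs neither in $\Sigma$ nor in the side formula $\psi$:

```python
# Terms/formulas as nested tuples (toy encoding, assumed for this sketch):
# ('const', 'c'), ('var', 'x'), ('forall', 'x', body), ('P', term), ...

def constants(f):
    """Collect the constant symbols occurring in a term or formula."""
    tag = f[0]
    if tag == 'const':
        return {f[1]}
    if tag == 'var':
        return set()
    if tag == 'forall':
        return constants(f[2])        # f[1] is a binder name, skip it
    return set().union(*(constants(g) for g in f[1:]))

def forall_intro_ok(sigma, psi, c):
    """Universal introduction (signature-extending form) is applicable
    only if the constant c was genuinely fresh: not already in the
    signature sigma, and not occurring in the hypothesis psi."""
    return c not in sigma and c not in constants(psi)

# fresh c: fine; reused c: the rule is blocked
assert forall_intro_ok({'0'}, ('P', ('const', '0')), 'c')
assert not forall_intro_ok({'0'}, ('P', ('const', 'c')), 'c')
```

This is just the usual eigenvariable condition, phrased as a side condition on constants rather than on variables.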

One of my reasons for liking this perspective is the operational, proof search reading that can be made of formulas as used by logic programming languages as described in Uniform Proofs as a Foundation for Logic Programming, for example. The operational reading of existential introduction (read "bottom-up") is to attempt to prove the formula that is existentially quantified by substituting a unification meta-variable for the bound variable. The operational reading of universal introduction, on the other hand, is to attempt to prove the formula that is universally quantified by substituting a fresh constant for the bound variable that only unifies with itself.
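That operational reading can be illustrated with a toy unifier (Python; the encoding is mine): metavariables, introduced when proving an existential, unify with anything, while the fresh constants introduced when proving a universal unify only with themselves:

```python
# Toy atoms: ('meta', n) for unification metavariables (from existentials),
# ('eigen', n) for fresh constants (from universals). Encoding assumed.

def unify_atom(a, b, theta):
    """Unify two toy atoms under substitution theta; return the extended
    substitution, or None on failure."""
    a, b = theta.get(a, a), theta.get(b, b)   # apply current bindings
    if a == b:
        return theta
    if isinstance(a, tuple) and a[0] == 'meta':
        return {**theta, a: b}                # a metavariable can be solved
    if isinstance(b, tuple) and b[0] == 'meta':
        return {**theta, b: a}
    return None   # eigenvariable constants unify only with themselves

# A metavariable can be solved to anything...
assert unify_atom(('meta', 0), ('eigen', 1), {}) == {('meta', 0): ('eigen', 1)}
# ...but two distinct eigenvariables never unify.
assert unify_atom(('eigen', 0), ('eigen', 1), {}) is None
```

The asymmetry between the two cases is precisely the proof-search shadow of the existential/universal introduction rules.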

Obviously, I'm not suggesting dumping all of this onto a beginning student. I am suggesting these ideas should inform the presentation. To directly answer your question, the benefit of (A) is a superficial reduction in the number of concepts and notational load. It should be clear that not only do I think this is a false economy but that it leads to a blurring of concepts that needs to be overcome later. So I would strongly encourage you to not only make notational distinctions but to clearly draw the conceptual separations that motivate the notational distinctions.

1 This description is not meant as a definition but just a quick way of describing the idea in terms of typical approaches. What I'm really saying is that $x$ is meaningless on its own. It only has meaning with respect to a given (variable) context and then only if that context contains $x$. This means that $\{x\}\vdash x$ and $\{x,y\}\vdash x$ are different terms-in-context.

2 To some extent, this whole "variable" dance is trying to capture a kind of bipartite directed hypergraph. Variables correspond to hyperedges with a head for each occurrence and a single tail that either emerges from its binding site or is loose in the case of a free variable. In other words, syntax involving binders is more graph-like than nicely fits into text. The issue comes up for binders generally and is not specific to quantifiers.


I'm not sure what you mean by a "parameter" here. I'll assume a "dummy" variable is the same as a bound variable.

Start with a semi-formal argument in Loglish:

(1) $\quad$ for every $x$, $x$ is $F$

This is an assumption. $x$ is a bound (dummy?) variable here.

(2) $\quad$ for every $x$, ($x$ is $F$ $\to$ $x$ is $G$)

This is an assumption. $x$ is a bound (dummy?) variable.

Arbitrarily choose some object in the domain, $a$; then we have

(3) $\quad a$ is $F$

This is an assumption. $a$ is a free variable.

(4) $\quad a$ is $F$ $\to$ $a$ is $G$

Applied Universal Specification here.

(5) $\quad a$ is $G$

Applied Detachment (MP) here.

But $a$ was arbitrarily chosen; so

(6) $\quad$ for every $x$, $x$ is $G$

Applied Universal Generalization here. $x$ is a bound (dummy?) variable.