How does $\exists !x:P(x)$ allow the definition of a new constant in formal logic?


In everyday math, if we ever prove the existence of a unique object with a certain property, we tend to give it a name and refer to it as "the" such-and-such object moving forward. For example, in ZFC set theory, an important early theorem is

$$\exists!x\forall y:y\notin x.$$

In words, "there is exactly one empty set." Thus we feel (intuitively) justified in defining the constant $\varnothing$ to refer to the unique $x$ such that $\forall y:y\notin x.$

However, to my knowledge there is no rule in first-order logic that actually says you can do this. How is this procedure formally justified?


One idea I have is to treat newly defined constant symbols like $\varnothing$ as abbreviations that do not appear in the formal statements of the language. However, one couldn't simply use a rule: "replace every occurrence of $\varnothing$ with ... ," because the "..." would have to be a logical formula meaning "the unique $x$ such that $\forall y:y\notin x,$" which I don't think can be translated into first-order logic. It would have to be something a bit more complicated: For example, every proposition of the form $P(\varnothing)$ could translate to a formal version $\forall x\big((\forall y:y\notin x)\to P(x)\big)$. But now the proposition that we have written as $P(\varnothing)$ in our informal proof has become a formula with a seemingly very different structure in the formal proof (a formula that contains two extra quantifiers and an extra connective), so it isn't at all clear (to me) that the informal proof still translates into a formal proof in any straightforward way.
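As an aside, proof assistants mechanize exactly this move at the object level. Below is a rough Lean 4 sketch (assuming Mathlib; the names `theWitness`, `theWitness_spec`, and `theWitness_unique` are my own, not a standard API) of how a unique-existence proof yields a named constant together with its defining property:

```lean
import Mathlib

variable {α : Type*} {P : α → Prop}

-- From a proof of unique existence, name "the" witness
-- (this uses the axiom of choice, hence `noncomputable`).
noncomputable def theWitness (h : ∃! x, P x) : α :=
  Classical.choose h.exists

-- The new constant satisfies the defining property ...
theorem theWitness_spec (h : ∃! x, P x) : P (theWitness h) :=
  Classical.choose_spec h.exists

-- ... and anything satisfying P equals it.
theorem theWitness_unique (h : ∃! x, P x) {y : α} (hy : P y) :
    y = theWitness h :=
  h.unique hy (theWitness_spec h)
```

This only shows that the move can be carried out in a richer foundation with a choice operator; how the purely first-order, syntactic story works is the question.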


There are 2 answers below.


Look at Section 74 of Kleene's "Introduction to Metamathematics."

First-order logic does not address this, and most people committed to first-order logic do not realize that this is the case. The reason first-order logic does not address it is that the development of a language in ordinary mathematics generates a sequence of signatures as definitions are made. By contrast, first-order logic hands you a fixed signature and says, "Go."

It is true that explicit definitions for functions and constants are eliminable from these signatures. But that is the nature of building a vocabulary. Skolem's primitive recursive arithmetic was based upon his disagreement with Russell's theory of descriptions. His notion of the use of descriptions has never been represented formally, because almost all description theory is motivated by Russell's treatment of descriptions as a form of quantifier. This is called "attributive use." Where controversy does exist, it involves "referential use" as presented by Strawson and probably assumed by Frege. What Skolem says about descriptions differs from both of these views.

As commonly applied, Tarski's semantic conception of truth treats constants as Russell had treated definite descriptions. And Russell's declaration that definitions are mere abbreviations has influenced how these matters are commonly understood. In his book "Definition," Richard Robinson examines what Whitehead and Russell wrote about definition in "Principia Mathematica." It is terribly inconsistent.

If I recall correctly, the October 2015 thread on free logic at the FOM mailing list will have some debate on issues relevant to your question.

Caveat emptor.

0
On

The definitional extension gives you a new logical language (a.k.a. signature) with the new symbol added as a constant letter, and a new theory with the definitional property as an additional axiom.
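Concretely, for the empty-set example: if the original theory $T$ over signature $\Sigma$ proves $\exists!x\forall y:y\notin x$, then the extended signature and theory are

$$\Sigma' := \Sigma\cup\{\varnothing\},\qquad T' := T\cup\{\forall y:y\notin\varnothing\}.$$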

The most intuitive motivation for this uses semantic arguments about models:

Whenever you have a model of the original theory, the definitional property tells you how to extend the model with an interpretation of the new symbol, such that it becomes a model of the extended theory. This automatically guarantees that

  1. Every logical consequence of the new theory that doesn't mention the new symbol is also a logical consequence of the old theory. In other words, the new theory is a conservative extension of the old one.

The "$!$" part of the definitional theorem guarantees that for each model of the original theory, there is exactly one way to complete it into a model of the extended theory while keeping the universe unchanged. Thus,

  2. Whenever you have a model of the original theory, formulas in the extended language can be considered to have well-defined truth values -- namely, the ones the unique completion of the interpretation gives them.

We can also approach the connection from a purely syntactic direction, where formal proofs are just a symbolic game and you don't consider models to be important for what you're doing:

  3. (Syntactic analogue of 1). Whenever $P$ is a formula in the original language that the extended theory proves, the original theory also proves $P$.

  4. (Partial syntactic analogue of 2). For every formula $P$ in the extended language, there is a $P'$ in the original language such that the extended theory proves $P\leftrightarrow P'$.

These properties have as immediate conclusions:

  5. (Equivalent to 3). If $T$ is a consistent extension of the original theory in the original language, then $T$ with the definitional axiom for $\varnothing$ added is a consistent theory in the extended language.

  6. (Corollary of 4). If $T$ is a complete extension of the original theory in the original language, then $T$ with the definitional axiom for $\varnothing$ added is a complete theory in the extended language.

In this form, the conceptual relation between the syntactic and semantic properties looks much nicer.

The proofs of (3) and (4) are where your idea of a translation comes in:

For example, every proposition of the form $P(\varnothing)$ could translate to a formal version $\forall x\big((\forall y:y\notin x)\to P(x)\big)$.

The meat of proving (3) now consists of showing by structural induction over proofs that whenever the extended theory proves $P(\varnothing)$, then the original theory proves its translation.

After this induction is done, we note separately that if $P$ doesn't mention $\varnothing$, then the original theory proves $P$ equivalent to its translation.

Note well that the translation is applied only to the conclusions of proofs. The axioms of the original theory still appear untranslated as axioms of the extended theory, and it's a purely local job of the translation of proofs to conclude their translation when the original proof appeals to an axiom.

You could also translate formulas with $P(\varnothing) \rightsquigarrow \exists x\big((\forall y:y\notin x)\land P(x)\big)$ instead of $\forall$ -- in the presence of the theorem $\exists!x\forall y:y\notin x$, the two translations are easily proved to be equivalent.
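To spell the equivalence out, write $E(x)$ for $\forall y:y\notin x$. Given $\exists x:E(x)$, the $\forall$-form implies the $\exists$-form by instantiating at a witness of $E$; given uniqueness, the converse holds because any $x$ with $E(x)$ must equal that witness:

$$\forall x\big(E(x)\to P(x)\big)\;\leftrightarrow\;\exists x\big(E(x)\land P(x)\big).$$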

For proving (4) it will be more convenient to work with a different translation where the quantification over $\varnothing$ happens for each atomic formula in $P$ rather than at the root of $P$ itself. Then the desired property arises almost by itself, by induction over the structure of $P$.
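For instance, under the per-atomic-formula translation a compound formula such as $\varnothing\in z\lor\lnot(z\in\varnothing)$ becomes

$$\forall x\big((\forall y:y\notin x)\to x\in z\big)\;\lor\;\lnot\,\forall x\big((\forall y:y\notin x)\to z\in x\big),$$

so each atom is wrapped separately while the propositional and quantifier structure of $P$ is untouched, which is what lets the induction over the structure of $P$ go through.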

(Since the translation mapping is not mentioned in the statements of (3) and (4), we're free to choose different translations for each purpose. The per-atomic-formula translation would be cumbersome to use for (3) because under that translation, substitution of terms can cause wrappers to spring into existence at the formula level, so the induction steps dealing with quantifier rules would need a lot more work.)