Trying to Understand the Axiom of Induction in the context of formalizing Natural Numbers

334 Views Asked by At

I have a question about formalizing the Natural Numbers, and the role of Peano's fifth axiom within his scheme. Let me describe what is my current understanding, and then get to some of my confusions ... and questions. But please correct any misunderstanding as well!

OK, here goes:

Peano's first four Axioms for defining the Natural Numbers are:

$1.$ Zero is a natural number.

$2.$ Every natural number has a successor which is also a Natural Number

$3.$ Zero is not the successor of any natural number.

$4.$ If the successor of two natural numbers is the same, then the two original numbers are the same.

I understand that with these four axioms, any model will have to contain an infinite and ordered sequence of objects denoted by $0, s(0), s(s(0)), ...$ ... which seems to be isomorph to the natural numbers. However, I understand that there could be other ('rogue') elements as well. So, it is my belief that the fifth axiom was introduced to try and rule out such rogue elements. In other words, we are looking for:

$5.$ There are no Natural Numbers other than those specified by the first four axioms.

However, instead of this kind of sentence, I see some kind of inductive axiom:

$5'$. If zero has some property, and if for any object it is true that if it has some property $P$, then its successor has the property as well, then all properties have that property P

Now, I am well familiar with proofs by induction and how they work, and how this axiom formalizes induction. I am also familiar with the fact that with this axiom we can prove things that we would like to prove about the natural numbers and that without the axiom we cannot (e.g. without the axiom we cannot prove $\forall x (x \neq 0 \to \exists y \ x = s(y))$, but with the axiom we can). So, my question is not about what the axiom says, how it works, or how it is useful. Instead, my question is simply this: why do I always see $5'$, and not $5$?

Is it because there is no direct logical translation/formalization of $5$? I would indeed have no idea how to formalize $5$ myself ... It seems to be about sets ('The set of natural numbers is the set containing $0, s(0), ...$ and nothing else') so could we formalize it using some set-theoretic axioms? I see some things to this effect, e.g. Von Neumann's recursive scheme of defining $0$ as $\emptyset$, and $s(a)$ as $a \cup \{ a \}$. But how would that rule out any rogue elements? We still need some way to say that 'nothing else is in the set'. Definitely a gap in my understanding there ...

OK, back to the axiom of induction. I can see how the axiom of induction actually does rule out any rogue elements (and without resorting to set theory): if there would be rogue elements other than the elements $0,s(0), ....$, then it doesn't seem like we would be able to say with certainty that all objects have property $P$ just because the elements $0,s(0), ....$ all have property $P$. So, the fact that we can say this with certainty (given the truth of the inductive axiom), must mean there are no rogue objects. Is that how it goes?

However, what if for any property that we come up with, we can in fact introduce some rogue element that has that property as well? That seems a tall order ... but in order for the inductive axiom to rule out such rogue elements, we'd have to be talking about a property that all the elements $0,s(0), ....$ have. And if they all have it ... then it must be some pretty 'lame' or trivial property, and so why couldn't we just give that to such a rogue elements as well? (Indeed, 'conditional' properties like 'if you're even and greater than $2$, then you are the sum of two prime numbers' can easily be satisfied by any rogue element by making them not even). So, maybe the axiom of induction is not enough to rule out rogue elements? Again, definitely a gap in my understanding here as well.

Finally, I am reading something about a difference between a first-order logic formulation of the axiom of induction, and a second order formulation, which I understand to be:

First-order Axiom of Induction: This is really not a single axiom, but an infinite set of axioms, namely: given any formula $\phi(x)$, we have as an axiom: $(\phi(0) \land \forall x (\phi(x) \to \phi(s(x)))) \to \forall x \ \phi(x)$

Second-order Axiom of Induction: This really is a single axiom, namely: $\forall \phi \ ((\phi(0) \land \forall x (\phi(x) \to \phi(s(x)))) \to \forall x \ \phi(x))$

Now, from what I gather from browsing Wikipedia, the second-order Axiom of Induction really does only allow models isomorphic to the natural numbers, and thus prevent any rogue elements, but the first-order axiom (schema) does not. OK, here I am really lost. Why would this be? I have a feeling there is some pretty advanced meta-theory at at play here that may well go over my head ... or is there a relatively easy way to explain this?

Oh, one more question: if indeed we have to revert to a second-order formalization of the axiom of induction, does that mean that there is no first-order logic formalization of the natural numbers at all? Or would a first-order formulation be possible using set-theory?

If it's not too much trouble to fill in these gaps in my understanding, I would really appreciate it. Alternatively, some good text or online resource that explains this to a relative lay-person would be appreciated as well. Thanks!!

2

There are 2 best solutions below

0
On

The Principle of Mathematical Induction can be stated in the language of set theory as :

$~~~~~\forall P \subset N: [0\in P~\land ~ \forall x\in P: S(x) \in P ~ \implies ~ P = N]$

It is based on the idea that any element of $N$ but $0$ itself can be reached by a process of repeated succession starting at $0$, i.e. every element of $N$ is accessible from $0$:

$~~~~~N=\{0, ~S(0), ~S(S(0)), ~S(S(S(0))), ~\cdots ~\}$

This notion can be formalized for any set $X$ (possibly finite) and $x_0\in X$ (the "first" element) and $f:X \to X$ (the "successor" function). It can be formally proven using a simplified form of natural deduction and basic set theory. (See link below.)


Example 1 (with no "rogue elements")

Suppose $X = \{0, 1 \}, f(0)=1$, and $f(1)=0$. Then induction holds on $X$ with "successor" function $f$ and "first" element $0$.

$~~~~~X=\{0, f(0)\}$


Example 2 (with "rogue element" $1$)

Suppose $X = \{0, 1 \}, f(0)=0$, and $f(1)=1$. Here, $1$ is a "rogue" element when the "first" element is $0$ since the element $1$ would then not be accessible from $0$.


EDIT: See my formal proof: Induction iff Accessible (228 lines -- quite challenging). In the set of natural numbers $N$, any number is “accessible” from $0$ by repeatedly going from one number to the next, starting at $0$. In other words, there are no elements of $N$ that are “isolated” from $0$ under the successor function S. As I attempt to show at this link, this self-evident property of the natural numbers turns out to be logically equivalent to the usual Principle of Mathematical Induction.

7
On

The keys to understanding why we can't axiomatize the natural numbers up to isomorphism with a first-order theory are the compactness theorem and the Lowenheim-Skolem theorem. The Upward Lowenheim-Skolem theorem implies that if a first-order theory in a countable language has an infinite model, or even arbitrarily large finite models, then it has models of every cardinality. Thus, except in the cases of finite structures, first-order theories can't characterize anything up to isomorphism.

In the case of the natural numbers, we can give an argument from the compactness theorem (which says that a theory has a model if and only if every finite subtheory has a model). Add a constant symbol $c$ to the language of arithemetic, and add the axioms $\{c>\mathbf n:n\in \mathbb N\}$ to your favorite first order theory of arithmetic. Every finite subtheory will be satisfied by $\mathbb N$ since there will only be finitely many of the $\{c>\mathbf n:n\in \mathbb N\}$ included in the subtheory, and we can interpret $c$ as something sufficiently big to satisfy it. So every finite subtheory has a model, and hence by compactness there is a model of the whole theory. But the interpretation of $c$ must be a "rogue" element since it is greater than every natural number. Also, since this theory has a countable language, it has a countable model, so we even have a countable structure that satisfies the axioms, but is different from $\mathbb N.$

To see that the second-order axiomatization does characterize $\mathbb N$ up to isomorphism, we need to think about how we make "has no rogue elements" rigorous at all. We call a set inductive if it contains $0$ and is closed under the $S$ operation. We say $\mathbb N$ is the smallest inductive set, so in other words something is a natural number and not a "rogue" if it is contained in every inductive set. In second-order arithmetic, we can express this property as $$ N(x) := \forall A(A0\land \forall y (Ay\to ASy)\to Ax).$$ And then we can prove $\forall x N(x)$ by applying the induction axiom to the property $N,$ which shows that all models of second order arithmetic are isomorphic to the natural numbers.

Finally, since we can do all math in set theory, all the above considerations can be formalized in ZFC. So e.g. ZFC shows that its natural numbers are a model of its internal version of second order arithmetic, and that they are the only model up to isomorphism. However, ZFC is a first-order theory, so Lowenheim-Skolem applies to it, and it has a whole mess of models. And a similar compactness argument shows that ZFC has models whose natural numbers are a nonstandard model with "rogue" elements, even though the model believes they are the smallest possible inductive set.

So this leaves us with a bit of a conundrum. We think of the natural numbers as $\{0,1,2,\ldots\}$ with the understanding that "$\ldots$" means that we include all the successive iterations of $S$ and nothing else, but how can we formalize what this means? Is this just a primitive concept that we have to live with an informal notion of or is there some noncircular way to make sense of it? In practice, we work in a foundational system of first order set theory, which allows us to formalize $\{0,1,2,\ldots\}$ as being minimal, i.e. the intersection of all inductive sets. This certainly isn't circular, but how do we know the minimal set actually only contains what we want? When we study models of our foundational system, we see that some of them will inevitably get this wrong: they don't include the "true" natural numbers and smallest inductive set included in the model is actually bigger and contains rogue elements. So there's no way we can show literally every natural number is represented by a term $S\ldots S\emptyset$ within first-order set theory (naively, this is something we can't even express, but this clinches that there is not even some creative way of doing so).