I've got Serge Lang's Undergraduate Algebra (2nd edition). In the Appendix is a treatment of the Peano Axioms, but, as he says: "
The rules of the game from now on allow us to use only sets and mappings."
Good. Here they are:
- There is an element $0 \in \mathbb{N}$
- We have $\sigma(0) \ne 0$ and if we let $\mathbb{N}^+$ denote the subset of $\mathbb{N}$ consisting of all $n \in \mathbb{N}$, $n \ne 0$, then the map $x \mapsto \sigma(x)$ is a bijection between $\mathbb{N}$ and $\mathbb{N}^+$
- If $S$ is a subset of $\mathbb{N}$, if $0 \in S$, and if $\sigma(n)$ lies in $S$ whenever $n$ lies in $S$, then $S = \mathbb{N}$.
He then says
We often denote $\sigma(n)$ by $n'$ and think of and think of $n'$ as the successor of $n$. The reader will recognize 3. as induction. We denote $\sigma(0)$ by $1$.
This supposedly implies $\sigma$ is a successor function, i.e., $\sigma(n) \implies n + 1$. To me it seems 2. and 3. must be working together to produce the successor function, although I can't see it. What necessarily forces $\sigma$ to behave as a successor function $\sigma(n) \implies n + 1$?
To elaborate on coffeemath's comment, let's consider this function, $f_x: \Bbb N \to \Bbb N$:
$f_x(0) = x\\f_x(\sigma(k)) = \sigma(f_x(k)).$
Note that, in turn, we have another function: $x \mapsto f_x$ (which goes from $\Bbb N \to \Bbb N^{\Bbb N}$: $\Bbb N^{\Bbb N}$ is just another way of naming the set of functions $\Bbb N \to \Bbb N$). Let's call this second function $g$.
We will be especially interested in $g(1)$. let's unpack what function $g(1)$ is:
$f_1(0) = 1\\f_1(\sigma(k)) = \sigma(f_1(k)).$
Hence, $f_1(1) = f_1(\sigma(0)) = \sigma(f_1(0)) = \sigma(1)$.
In order to save time, let's look at the set: $T = \{n \in \Bbb N: \sigma(n) = f_1(n)\}$. We have already seen this set contains $0$ and $1 = \sigma(0)$.
Now suppose that $k \in T$, so that $f_1(k) = \sigma(k)$. By the definition of $f_1$:
$f_1(\sigma(k)) = \sigma(f_1(k)) = \sigma(\sigma(k))$, that is $\sigma(k) \in T$. So by rule 3, $T = \Bbb N$.
We thus conclude that $f_1 = \sigma$, since they agree on every domain element.
You probably know the function $f_x$ better as: "add $x$ to $k$", that is, the function that sends $k \mapsto k+x$, so hopefully you see now that $\sigma$ is thus the function that sends $k \mapsto k+1$. The reason $\sigma$ is not DEFINED as: $\sigma(0) = 1$, and $\sigma(k) = k+1$, is that we need a reasonable definition of "+" to define $\sigma$, then, and we want to USE $\sigma$ to define +.
The format:
$h(0) = a\\h(\sigma(k)) = f(h(k))$
(where $f$ is some function from the set $a$ lives in to that same set) is called a recursive definition of $h$, and allows us to specify $h$ for any natural number by just specifying the "rule" $f$, and the "seed value" $a$.
For example, if $a = 1 \in \Bbb R$, and $f: \Bbb R \to \Bbb R$ is the "doubling function" ($f(x) = 2x$), we get:
$h(0) = 1\\h(k+1) = 2\cdot h(k)$
which is a recursive definition of the function $h:\Bbb N \to \Bbb R$ given by $h(k) = 2^k$.
The idea is: rules 2 & 3 allow us to make such recursive definitions, because at each step, we can compute $h(n)$ in terms of $h$ evaluated at "smaller" values for $n$ we have previously computed.