How should this proof of the associativity of natural number addition be understood?

318 Views Asked by At

The context of this question is the proofs of the basic laws of arithmetic beginning with Peano's axioms which are stated starting with the number 1, rather than 0. The modified principle of induction referred to below is:

If the number $m$ is contained in the number set $M$ whenever $n\in M$ for all numbers $n<m,$ then $M=\mathbb{N}.$

See the end of the question for more on the modified principle induction.

This is actually stated is a subsequent section and forward referenced.

The previously introduced formula for the recursive definition of a function $f:\mathbb{N}\to\mathbb{N}$ is \begin{equation} \underset{x}{\forall}f\left[1\right]=c\land f\left[x^{\prime}\right]=F\left[x,f\left[x\right]\right].\tag{1} \end{equation}

The previously established associative law for addition of thee terms is

\begin{equation} \left(a+b\right)+c=a+\left(b+c\right). \tag{2} \end{equation}

The following is quoted from The Fundamentals of Mathematics, Volume 1:

By (2) we may therefore omit the parentheses in a sum of three terms. In order to be able to omit them in sums with more than three terms, we first define the expression $\sum_{i=1}^{n}a_{i}$ for a given sequence $\left\{ a_{i}\right\} _{i=1,2,\dots}$ of numbers $a_{i}$ recursively by setting

\begin{equation} \sum_{i=1}^{1}a_{i}=a_{1}, \sum_{i=1}^{n+1}a_{i}=\sum_{i=1}^{n}a_{i}+a_{n+1};\tag{3} \end{equation}

for this purpose we need only set $c=a_{1}$ and $F\left[x,y\right]=y+a_{x+1}$ in (1). In particular, we have $\sum_{i=1}^{3}a_{i}=\left(a_{1}+a_{2}\right)+a_{3}=a_{1}+a_{2}+a_{3}$ and $\sum_{i=1}^{4}a_{i}=\left(a_{1}+a_{2}+a_{3}\right)+a_{4},$ for which we again naturally write $=a_{1}+a_{2}+a_{3}+a_{4}.$ Moreover, no parentheses are needed to express addition of such sums, as is shown by the equation \begin{equation} \sum_{i=1}^{n}a_{i}+\sum_{i=1}^{m}a_{n+i}=\sum_{i=1}^{n+m}a_{i}.\tag{4} \end{equation}

The proof of this equation is by argument from $m$ to $m^{\prime}:$ for $m=1,$ Eq. (4) becomes the second of the equations in (3); and from (4) we see by (3) and (2) that

$$ \sum_{i=1}^{n}a_{i}+\sum_{i=1}^{m}a_{n+i}+a_{n+i}=\sum_{i=1}^{n}a_{i}+\left(\sum_{i=1}^{m}a_{n+i}+a_{n+m^{\prime}}\right) $$

$$ =\left(\sum_{i=1}^{n}a_{i}+\sum_{i=1}^{m}a_{n+i}\right)+a_{n+m^{\prime}} $$

$$ =\sum_{i=1}^{n+m}a_{i}+a_{\left(n+m\right)^{\prime}}=\sum_{i=1}^{\left(n+m\right)^{\prime}}a_{i}=\sum_{i=1}^{n+m^{\prime}}a_{i}. $$

We note that none of the properties of numbers (except where they are used as indices) is needed here except (2). The extension of [the commutative property] to sums of more than two terms will be provided [in a subsequent section].

We now prove by means of (4) that any meaningful expression $A$ constructed from the numbers $a_{1},a_{2},\dots,a_{k}$(in this order), and from $+$ signs and parentheses has a value, namely $=\sum_{i=1}^{k}a_{i},$ which is independent of the the distribution of parentheses (for $k=1$ the expression $A$ is to be taken equal to $a_{1}$). For the proof we make use of induction on $k$ in the altered form of [the modified principle of induction stated above] (with $k$ instead of $m$ and with $M$ as the set of numbers $k$ for which the assertion is true). By the construction of $A$ there must exist natural numbers $n,m$ with $n+m=k$ $\left(k\ne1\right)$ such that for expressions $B,C$ formed from $a_{1},a_{2},\dots,a_{n}$ and $a_{n+1},a_{n+2},\dots,a_{m}$ under the appropriate distribution of parentheses, we have $A=B+C.$ Since $n,m<k,$ the induction hypothesis means that $B=\sum_{i=1}^{n}a_{i},$ $C=\sum_{i=1}^{m}a_{n+i},$ so that the desired assertion $A=\sum_{i=1}^{k}a_{i},$ follows from (4). For the case of $k=1$ the assertion is immediately obvious.

I have (at least) three related question regarding this proof.

1) At the point where it is stated, should it be understood that Eq. (4) is of the form $a_{1}+a_{2}+\dots+a_{m+n},$ and therefore not in need of parentheses?

2) What is the purpose of employing the modified principle of induction, as opposed to original principle of induction? I believe it is primarily to show that $n,m\ge1$ and $n,m<k.$

3) Can this proof be paraphrased in such a way as to communicate the essential points without the detailed mathematical jargon?


The entire introduction of the modified principle of induction is this:

\begin{equation} \Theta:=a<b+1\iff a\le b. \end{equation}

The prime notation $m^{\prime}$ means $m^{\prime}=m+1.$

From the principle of induction we can now derive the following modified principle of induction: If the number $m$ is contained in the number set $M$ whenever $n\in M$ for all numbers $n<m,$ then $M=\mathbb{N}.$ The induction hypothesis now reads: "$n\in M_{1}$ for all numbers $n<m$"; and there is no special initial case. For the proof we consider the set $M^{*}$ of numbers $m$ with $n\in M$ for all $n<m.$ Then the hypothesis of our new principle simply states that $M^{*}\subseteq M.$ Since $n<1$ is not valid for any number $n,$ we get $1\in M^{*}.$ By $\Theta$, any number $n<m^{\prime}$ is $<m$ or $=m.$ If we now assume that $m\in M^{*},$ then $n$ lies in $M$ not only in the first case but also, since $M^{*}\subseteq M,$ in the second case as well. Thus we have derived $m^{\prime}\in M^{*}$ from $m\in M^{*},$ so that by the argument from $m$ to $m^{\prime}$ we have $M^{*}=\mathbb{N}$ and thus also $M=\mathbb{N}.$


One problem I'm having with this proof is identifying what has not yet been proven.

I'm assuming the proof presented in the book might be paraphrased this way:

Any expression of the form

$$ \sum_{i=1}^{n}a_{i}+\sum_{i=1}^{m}a_{n+i} $$

is composed of two expressions for which we have already shown our proposition to hold in "previous" iterations. This is why we use baseless (having no specific base case) complete induction (AKA modified or strong induction), rather than basic (having a base case) complete induction. Baseless induction allows us to reference arbitrary previous cases.


Stated differently what we are proving is that using the indexed number variables $\left\{a_i\right\}_{i=1}^k$ any allowable expression consisting of the tokens $a_i,\left(\_\right),+,$ designates a specific number, and may be rewritten as

$$a_1+a_2+\dots+a_k$$

in which the order of appearance of the $a_i$ is preserved, and the resulting expression designates the same number.


Yet another way of thinking about this would be to formulate a parsing algorithm which converts allowable expressions into a parse tree. The root of such a tree in the $k^{th}$ case would be the "exposed" $+$ sign appearing between the variables $a_n$ and $a_{n+1}.$ Clearly both branches off the root are shorter than $k$. We then state our rewrite rule as

$$\left(a_1+\dots+a_n\right)+\left(a_{n+1}+\dots+a_{n+m=k}\right)\mapsto{a_{1}+\dots+a_{k}},$$

which is allowable since we have already proven the case for each branch.


But that seems to be what we have already done by proving associativity for three terms. The following is simply an extension of my proof of that associativity. My proof (for three terms) is nothing but a reformulation of the proof given in the book using function notation rather than binary $+$ syntax.

Define the set of function:

$$ \Phi=\left\{ \varphi_{a\in\mathbb{N}}:\mathbb{N}\to\mathbb{N}\backepsilon\varphi_{a}\left[x^{\prime}\right]\equiv\varphi_{a}\left[x\right]^{\prime}\land\varphi_{a}\left[1\right]\equiv a^{\prime}\right\} . $$

To relate this to conventional notation we also define

$$ \left[\_+\right]:\mathbb{N}_{1}\to\Phi $$

so that

$$ \left[a+\_\right]\equiv\varphi_{a}, $$

to be interpreted as $$ a+b:\mathbb{N}\times\mathbb{N}\to\mathbb{N}\text{ where }a+b\equiv\varphi_{a}\left[b\right]. $$

$$ \left(a+b\right)+c=\varphi_{a}\left[b\right]+c=\varphi_{a+b}\left[c\right]=a+b+c. $$

$$ \left(a+b\right)+\left(c+d\right)=\varphi_{a}\left[b\right]+\varphi_{c}\left[d\right] $$

$$ =\varphi_{a+b}\left[\varphi_{c}\left[d\right]\right]=\varphi_{a+b+c}\left[d\right]=a+b+c+d. $$