On formalising logic


Recently, I became fascinated with set theory and I am willing to learn more. However, there are some aspects that I would like to understand before diving in. A lot of questions concerning the foundations of mathematics have been posted, and I went to read some of them. My question is closely related to Set theoretic concepts in first order logic, and more precisely to the answer given by Peter Smith there.


Let’s (naively) pretend that someone not familiar with mathematical reasoning or sets wants to learn about set theory and we want to teach them. I’m trying to understand how this could be done in the most self-contained way (since that person doesn’t have any strong mathematical background).

I think it would be really helpful to learn some logic first. So I went searching and picked up the book Logic and Structure by Dirk van Dalen. However, it uses a lot of mathematical terms that “appear” to come from set theory. This could be solved by approaching logic “verbally” (as in the link I mentioned).

I was trying to do this, but it is quite “awkward” to say the least. For example, van Dalen starts by presenting the alphabet of propositional logic and shows how propositions can be constructed. In his definition, he uses the term smallest set:

Definition $2.1.1$ The language of propositional logic has an alphabet consisting of

$(i)$ proposition symbols: $p_0, p_1, p_2, \dots$,

$(ii)$ connectives: $\wedge$, $\vee$, $\rightarrow$, $\neg$, $\leftrightarrow$, $\bot$,

$(iii)$ auxiliary symbols: $($, $)$.

Definition $2.1.2$ The set $\mathsf{PROP}$ of propositions is the smallest set $X$ with the properties

$(i)$ $p_i \in X$ ($i \in \mathbb{N}$), $\bot \in X$,

$(ii)$ $\varphi, \psi \in X \implies (\varphi \wedge \psi), (\varphi \vee \psi), (\varphi \rightarrow \psi), (\varphi \leftrightarrow \psi) \in X$,

$(iii)$ $\varphi \in X \implies (\neg \varphi) \in X$.
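
The smallest-set definition above can be read concretely as a recursive membership test. The following Python sketch (the tuple encoding and names are my own illustrative choices, not van Dalen's) checks whether an object is built by finitely many applications of clauses (i)–(iii):

```python
# Illustrative sketch: membership in PROP, realized as structural recursion.
# Atoms are strings "p0", "p1", ... or the falsum "bot"; compound propositions
# are tuples ("and", phi, psi), ("or", phi, psi), ("->", phi, psi),
# ("<->", phi, psi), ("not", phi).

BINARY = {"and", "or", "->", "<->"}

def is_prop(x):
    """Return True iff x is built by finitely many applications of (i)-(iii)."""
    if x == "bot":
        return True                      # clause (i): bot is in PROP
    if isinstance(x, str) and x.startswith("p") and x[1:].isdigit():
        return True                      # clause (i): proposition symbols p_i
    if isinstance(x, tuple):
        if len(x) == 3 and x[0] in BINARY:
            return is_prop(x[1]) and is_prop(x[2])   # clause (ii)
        if len(x) == 2 and x[0] == "not":
            return is_prop(x[1])                     # clause (iii)
    return False

print(is_prop(("and", "p0", ("not", "bot"))))  # True
print(is_prop(("and", "p0")))                  # False: malformed
```

The recursion terminates precisely because every proposition is reached from the atoms in finitely many steps, which is the content of the "smallest set" clause.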

Someone who has studied (at least naive) set theory will understand what this means, but can one achieve the same without using these ideas?

What I want to know is: to develop the propositional calculus, how can one express the second definition without using the term smallest set?

This is quite a good way to model the set of propositions, since it allows us to prove the Induction Principle:

Theorem $2.1.3$ (Induction Principle) Let $A$ be a property, then $A(\varphi)$ for all $\varphi \in \mathsf{PROP}$ if

$(i)$ $A(p_i)$, for all $i$, and $A(\bot$),

$(ii)$ $A(\varphi), A(\psi) \implies A((\varphi \square \psi))$,

$(iii)$ $A(\varphi) \implies A((\neg \varphi))$.

This principle is useful, for example, to show that every proposition has one and only one truth value. But how can I prove it without the previous notion of smallest set?
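
As an illustration of what the Induction Principle buys you, here is a sketch of a truth valuation in Python (my own encoding of propositions as nested tuples): the recursion below assigns each proposition exactly one truth value, and its totality for all of PROP is exactly what clauses (i)–(iii) of the principle establish.

```python
# Sketch: a truth valuation defined by structural recursion. Atoms are names
# looked up in a valuation v; "bot" is always false; compound propositions are
# tuples such as ("and", phi, psi) and ("not", phi).

def value(phi, v):
    if phi == "bot":
        return False
    if isinstance(phi, str):
        return v[phi]                       # base case: property holds for p_i
    if phi[0] == "not":
        return not value(phi[1], v)         # step case (iii)
    a, b = value(phi[1], v), value(phi[2], v)
    return {"and": a and b, "or": a or b,
            "->": (not a) or b, "<->": a == b}[phi[0]]   # step case (ii)

v = {"p0": True, "p1": False}
print(value(("->", "p0", ("not", "p1")), v))  # True
```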

Is there even a way to handle propositional and first-order logic without using set theory, so that I can then use it to work with set theory? If yes, how? (Or where can I find such an approach in the literature?)

Thank you for your attention!


There are 4 answers below.

Answer 1 (score 16)

In your textbook's Definition 2.1.2, "The set PROP of propositions is the smallest set X..." is only a meaningful proposition when semantically interpreted in its set-theoretic metalanguage. At the level of classical propositional logic, we can only say it is either true or false by the law of excluded middle, according to whether or not this definition picks out a really existing (abstract) mathematical object, a set satisfying all the later properties. If you try to further prove that it really is the smallest such set, then you have to employ some predicate from a set-theoretic (or alternative) metalanguage, including the semantic notion of an order relation.

That being said, for finite numbers of objects we can express such claims in FOL alone. For example, "there are at least 3 objects satisfying P" can be written, albeit in a complicated fashion, as: $$ ∃x ∃y ∃z (P(x) ∧ P(y) ∧ P(z) ∧ x \neq y∧ x \neq z ∧ y \neq z) $$

So, in general, to prove that there are at least $n$ objects in FOL we must deduce a sentence with $n$ existential quantifiers and $1+2+\cdots+(n-1) = \binom{n}{2}$ identity negations. But your set PROP is infinite in terms of its members, even though it may be picked out by a definite description, so FOL by itself cannot express "smallest" via the above "at least" approach; you need to add an order predicate from its metalanguage.
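
A small Python sketch makes the counting explicit (the variable and predicate names are my own): the sentence for "at least $n$" has $n$ quantifiers and $n(n-1)/2$ inequations, matching the $n = 3$ example above.

```python
# Sketch: building the FOL sentence "there are at least n objects satisfying P",
# with n existential quantifiers and n(n-1)/2 pairwise inequations.

from itertools import combinations

def at_least(n, pred="P"):
    vs = [f"x{i}" for i in range(1, n + 1)]
    quants = " ".join(f"∃{v}" for v in vs)
    atoms = [f"{pred}({v})" for v in vs]
    diffs = [f"{a} ≠ {b}" for a, b in combinations(vs, 2)]
    return f"{quants} ({' ∧ '.join(atoms + diffs)})"

print(at_least(3))
# ∃x1 ∃x2 ∃x3 (P(x1) ∧ P(x2) ∧ P(x3) ∧ x1 ≠ x2 ∧ x1 ≠ x3 ∧ x2 ≠ x3)
```

No finite schema of this kind can express "smallest", which is the point made above.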

How? One approach at the set-theoretic metalanguage level is to first prove a lemma like the one below:

If PROP is the intersection of the collection of all sets satisfying (i)–(iii) of Definition 2.1.2, then PROP also satisfies (i)–(iii).

Then PROP must be the smallest such set, since the intersection of a collection of sets is a subset of each of them. The "smallest set" predicate is thus paraphrased in terms of subsets and intersections at the metalanguage level.

Finally, the definition of PROP in 2.1.2 is very similar to the inductive set-theoretic definition of the natural numbers $\mathbb{N}$ via the Peano axioms. The advantage of defining PROP in this manner is that it paves the way for inductive proofs of general properties of PROP, akin to proofs by induction over the natural numbers.

Answer 2 (score 4)

There are several options. One is to use Backus–Naur Form (BNF), a concise metalanguage for presenting context-free grammars. The idea here is to view formulas as sequences of so-called terminal symbols, which are recursively generated via so-called production rules. Every production rule defines a non-terminal symbol together with sequences of terminal and non-terminal symbols, which replace the non-terminal symbol during generation.

For example the terms of a first-order language with constants and function symbols are generated via the BNF:

$t \space ::= \space x\space |\space c\space |\space ft_1 \ldots t_n$,

where $x$ is a variable of the language, $c$ a constant, $f$ an $n$-place function symbol and $t_i$ terms.

This BNF tells you that the non-terminal $t$ may be replaced by either a terminal $x$ or a terminal $c$ or any sequence consisting of a terminal $f$ followed by the non-terminals $t_1, \ldots, t_n$. In other words: A term is either a variable or a constant or a function symbol followed by an appropriate number of terms.
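
To make the generation concrete, here is a sketch of a recognizer for this term BNF in Python, reading prefix-notation token lists; the function symbols `f`, `g` and their arities are my own illustrative choices, not part of the answer above.

```python
# Sketch: a recursive-descent recognizer for the term BNF  t ::= x | c | f t1...tn,
# over tokens in prefix notation. Each clause of the grammar becomes one branch.

ARITY = {"f": 2, "g": 1}          # hypothetical function symbols with arities

def parse_term(tokens, i=0):
    """Consume one term starting at position i; return the next position."""
    tok = tokens[i]
    if tok.startswith("x") or tok.startswith("c"):
        return i + 1                       # t ::= x | c
    n = ARITY[tok]                         # t ::= f t1 ... tn
    i += 1
    for _ in range(n):
        i = parse_term(tokens, i)
    return i

def is_term(tokens):
    try:
        return parse_term(tokens) == len(tokens)
    except (KeyError, IndexError):
        return False

print(is_term(["f", "x1", "g", "c0"]))   # True:  f(x1, g(c0))
print(is_term(["f", "x1"]))              # False: f needs two arguments
```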

Analogously, the formulas for a language with a logical signature consisting of $\neg, \wedge, \forall$ is generated by the BNF:

$\varphi::= Rt_1 \ldots t_n \space | \space \neg \varphi \space \space | \space (\varphi \wedge \varphi') \space | \space \forall x \varphi \space$,

where $R$ is an $n$-place relation symbol, $t_i$ are terms and $x$ a variable of the language.

Answer 3 (score 3)

This reference to sets in "the smallest set $X$ such that..." can easily be circumvented.

You can define a proposition as any assemblage that appears in a finite sequence of assemblages $A_1, A_2, \dots, A_n$ in which each assemblage is formed from the preceding ones in certain well-defined ways, as in Definition 2.1.2. Then, whenever you need to use a proposition in your formal language, you can require that a suitable finite sequence of assemblages first be exhibited.
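
The formation-sequence idea can be checked mechanically. The sketch below (my own tuple encoding of assemblages) verifies that each entry in a sequence is an atom or is formed from earlier entries:

```python
# Sketch: verifying a formation sequence A_1, ..., A_n. Each entry must be an
# atom, or a negation/binary combination of strictly earlier entries.

def is_formation_sequence(seq):
    for k, a in enumerate(seq):
        earlier = seq[:k]
        if a == "bot" or (isinstance(a, str) and a.startswith("p")):
            continue                                   # atoms need no history
        if isinstance(a, tuple) and a[0] == "not" and a[1] in earlier:
            continue
        if (isinstance(a, tuple) and len(a) == 3
                and a[0] in {"and", "or", "->", "<->"}
                and a[1] in earlier and a[2] in earlier):
            continue
        return False
    return True

seq = ["p0", "bot", ("not", "bot"), ("and", "p0", ("not", "bot"))]
print(is_formation_sequence(seq))   # True
```

Exhibiting such a sequence plays the role that "membership in the smallest set" plays in the set-theoretic formulation, without quantifying over sets.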

The problem is more substantial when you seek to prove propositions about assemblages, propositions, formal proofs, etc. In this case, you must decide whether the meta-theory you are working in is some form of set theory, which is easiest for purposes of exposition, or some form of arithmetic. Much of what is proved in logic about formal proofs, etc., could be formulated in fairly weak systems of arithmetic (where integers correspond via an effective bijection to assemblages). The main problem is that few people have the stomach for the details involved, and certainly, no one would want to inflict all of that on a beginner.

Edit: Your comments clarify what your concerns are. Before answering the questions, I think it would be clearest if I talked a little bit about the ways you can look at a system of logic as a foundation for mathematics.

In the most basic use of logic, you are merely setting up a system of symbols to formalize the statements and proofs of ordinary mathematical argument, say with set theory as a foundation. This is a purely syntactic level.

You have a certain intuitive conception of what the statements are meant to represent, and you need to check the following two things. First, you want to check that you consider the methods of deduction in your system to be valid, i.e. that the system is sound. That is, applied to true statements, they should result only in true statements. Second, you want to check that all the forms of argument ordinarily considered valid in mathematics have counterparts in the formal deductive system, i.e. that the system is complete.

At this purely syntactic stage, you have no formalized concept of the "meaning" of a statement. Therefore, the two points above cannot be proved in any rigorous sense. Soundness can be checked only by convincing yourself that, according to your intuitive sense of what the statements are supposed to "mean," the rules of deduction are acceptable. Completeness, at this stage, can be thought of as experimental in the sense that no one has observed a mathematical proof regarded (by humans) as valid that could not, with sufficient effort, be formalized in the system.

Of course, the forms of reasoning humans consider valid are based on our innate conceptions of logic.

Now let's think about your concern about a vicious circle. At this stage, you have a formal system of proof that you're satisfied with. There's nothing to stop you now from starting to use it to prove theorems of set theory. You don't really need to know any metatheorems to make use of it this way, so there can be no vicious circle.

In practice, you may need to prove a few metatheorems to convince yourself of the intuitive soundness and completeness of your system. Mostly, these will consist of various shortcuts that you can employ in formal proofs.

All of these basic metatheorems can be formalized in a fairly weak system of arithmetic (including some basic form of induction). But if your purpose is really to convince yourself of the two points I mentioned, whose content is intuitive anyway, this shouldn't matter, since all you need as you go along is to check that the metatheorems use no more than intuitively obvious facts about arbitrary finite sequences of characters written down on paper.

The next stage in developing logic is semantic, in the sense that you are concerned with giving a formal definition of the "meaning" of statements and proving metatheorems including, at minimum, formal equivalents of the two points above.

Let's think about what's involved at this stage. For the sake of simplicity, let's say that, instead of set theory, you're concerned with groups or, more generally, with structures consisting of a bunch of objects and a binary operation on them.

It won't have escaped your attention that the word "bunch" above is a cop-out to avoid using the word "set." There's really no way around the fact that a group is bound to be, in some sense at least, something like a set, as is the group operation, which is a "bunch" of triples.

Even set theory, ultimately, is about a "bunch" of objects called sets and a binary relation denoted $\in$ between them, itself a "bunch" of pairs of objects.

I think this makes it clear that the natural medium in which to formulate semantic questions is going to be akin to some version of set theory.

How is this done? Let's return to the example of groups. Now some logicians, for good reasons, like to consider a group to be a set $G$ together with a binary operation (representing the group operation), a unary operation (representing the taking of inverses) and a constant (representing the unit element). But to keep things simple, let's say that the language of groups consists instead of just a binary operation. (There are some technicalities here regarding the status of the binary relation $=$, but they are of no real consequence. For definiteness, we'll assume that the symbol $=$ is interpreted by the actual equality relation in $M$.)

Now you define a structure ${\cal M} = (M,\cdot)$ in the language of groups to be a set $M$ together with a binary operation $\cdot$ on $M$. (I'll just write $M$ for $\cal M$ below.) You define, by induction on the complexity of a formula $\varphi(v_1, \dots, v_n)$ with free variables $v_1, \dots, v_n$, the concept of such a structure $M$ satisfying $\varphi$ when $v_1, \dots, v_n$ are replaced with particular elements $x_1, \dots, x_n$ of $M$.

A statement is just a formula $\varphi$ with no free variables, and a theory is a set $T$ of statements. For example, $T$ could be the set of axioms for groups. A structure $M$ is a model of $T$ if it satisfies each statement in $T$.

At this point we can begin to think about how to formalize the two points from before. Let a theory $T$ and a statement $\varphi$ be given. Then we'll say that $\varphi$ is a syntactic consequence of $T$ if there is a formal proof (according to your system) of $\varphi$ using only axioms in $T$ as assumptions. We'll say that $\varphi$ is a semantic consequence of $T$ if every model of $T$ also satisfies $\varphi$.

Now the soundness of your system of proof means that whenever a statement $\varphi$ is a syntactic consequence of a theory $T$, it is also a semantic consequence of $T$. The completeness of the system is the converse, namely that whenever $\varphi$ is a semantic consequence of $T$, it is also a syntactic consequence of $T$.
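
For intuition, here is the semantic-consequence side scaled down to propositional logic, where every "model" is a finite truth assignment and the check is a brute-force loop over valuations. The encoding of formulas as nested tuples is my own sketch, not a standard API:

```python
# Sketch: semantic consequence for propositional logic. T ⊨ phi iff every
# valuation making all of T true also makes phi true.

from itertools import product

def value(phi, v):
    if isinstance(phi, str):
        return v[phi]
    if phi[0] == "not":
        return not value(phi[1], v)
    a, b = value(phi[1], v), value(phi[2], v)
    return {"and": a and b, "or": a or b, "->": (not a) or b}[phi[0]]

def atoms(phi, acc=None):
    acc = set() if acc is None else acc
    if isinstance(phi, str):
        acc.add(phi)
    else:
        for sub in phi[1:]:
            atoms(sub, acc)
    return acc

def semantic_consequence(T, phi):
    """True iff every valuation satisfying all of T also satisfies phi."""
    names = sorted(set.union(atoms(phi), *(atoms(t) for t in T)))
    for bits in product([False, True], repeat=len(names)):
        v = dict(zip(names, bits))
        if all(value(t, v) for t in T) and not value(phi, v):
            return False
    return True

T = [("->", "p", "q"), "p"]
print(semantic_consequence(T, "q"))          # True: modus ponens is valid
print(semantic_consequence(T, ("not", "q"))) # False
```

For first-order logic the models are structures rather than truth assignments, and no such exhaustive check exists, which is why soundness and completeness become substantial theorems.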

Gödel's completeness theorem states that certain well-known systems of proof, which are fairly easily shown to be sound, are also complete. In other words, if you choose one of these systems, then you have a good system of formal proofs that allows you to deduce a statement if and only if you ought to be able to deduce it.

Now if we switch from group theory to set theory, we'll say that a structure in the language of set theory is a set $M$ together with a binary relation $E$ on $M$, which is meant to represent set membership (but which need not be anything like the actual $\in$ relation between elements of $M$). If the pair $(M,E)$ satisfies, say, the axioms of ZFC, then we'll say it's a model of ZFC.

So here you have to face the reality that the universe you're working in (call it Level 2, the metatheory) uses some version of set theory, while some of the formal objects within your theory (at Level 1, the object level) are models $(M,E)$ of some version $T$ of set theory (perhaps quite a different one), with $T$ also being a formal object of the metatheory. It can be a bit disconcerting to manipulate similar objects at Levels 1 and 2 simultaneously.

Now I'll try to answer your questions.

  1. I can use naive set theory as my meta theory to develop Mathematical logic, right? (In this way I would be able to use sets, functions, relations, and other “structures” because they come from it)

The answer is yes. But think about what you mean by "develop mathematical logic."

If what you mean is to set up the basic syntactic system in which you will prove mathematical statements, then you need, at most, some very obvious statements about finite sequences of characters on paper. While set theory is a convenient language in which to discuss this, it isn't necessary.

If what you want to do is more elaborate, for example to prove rigorously various facts about what can or can't be proved from particular sets of axioms, then the specific set of axioms of your metatheory may be very relevant. For example, if your metatheory is ZFC + (There exists a strongly inaccessible cardinal), then you can prove that the theory ZFC is consistent (by exhibiting a model of ZFC). But if your metatheory is ZFC, then either your metatheory is inconsistent or your metatheory cannot prove that the theory ZFC (a formal object within the metatheory) is consistent. (This is a consequence of Gödel's incompleteness theorem.)

  2. Back to sets: We already use some ideas from logic to work in naive set theory... wouldn't that be circular?

I think what you mean here is that, for example, we use the idea of conjunction in defining the intersection of two sets, etc. Please correct me if I'm wrong.

This is not a problem because such uses of conjunction ultimately amount to using the conjunction symbol in certain places in formal proofs. As long as you are obeying the rules of formal proofs, including the rules for manipulations involving the conjunction symbol, you are following the rules of the game.

Of course, what you are really doing is carrying out arguments which, psychologically speaking, are in the semantic realm, but which could be reduced with some work to formal syntactic proofs.

  3. I was intrigued by the "arithmetic way". Can you please give an example of how that can be done? Or some literature?

With respect to syntax, the arithmetization of formal proofs will necessarily be discussed in any source that presents a detailed proof of Gödel's incompleteness theorem. Two possible sources would be Mathematical Logic by Cori and Lascar, or Fundamentals of Mathematical Logic by Hinman. The theory of recursive functions is fundamental for this.

On the other hand, I'm just not familiar at all with the ways in which semantic logical concepts (most naturally developed within set theory) can be transposed into arithmetic. For example, I don't know how complex a system of arithmetic you would need in order to formulate and prove meaningful versions of Gödel's completeness theorem. In any case, I anticipate that this area must be highly technical. If all you care about is soundness, I suspect that you can get by with less. Once you've learned enough logic (using set theory as your metatheory), you could ask a new question specifically about this.

The bottom line, then, is that there is no vicious circle if all you want to do is use your system of formal proofs to write down mathematics. However you must take a certain amount of mathematics (a subset of ZF) on faith if you wish to prove that your system is sound and complete.

Answer 4 (score 0)

I think your main issue is that you do not realize that your version of PROP is impredicative, and such definitions arguably require significant set-theoretic commitment. After all, you are asking for the smallest set that is closed under some rules, so on its face that requires you to contemplate all such sets and somehow obtain their intersection as a whole.

But that is not the way to do things if you want to minimize philosophical assumptions! Instead of an impredicative definition, you can use a predicative one. PROP can easily be defined as follows. By recursion, define $F∈ℕ→Str$ such that $F(0)$ is the set of propositional atoms and, for each $k∈ℕ$, $F(k+1)$ is the set of formulae that are in $F(k)$ or can be obtained by joining members of $F(k)$ with one propositional connective. Then let $U$ be the union of $F(k)$ over all $k∈ℕ$. $U$ is the set you want!
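
The stratified construction can be run directly for finitely many atoms and stages. In this Python sketch (the tuple encoding is my own), `step` plays the role of the map from $F(k)$ to $F(k+1)$:

```python
# Sketch of the predicative construction: F(0) is the atoms, F(k+1) adds
# everything formed from F(k) by one connective, and PROP is the union over k.

def step(S):
    """From F(k), produce F(k+1): S plus one-connective combinations."""
    new = set(S)
    for a in S:
        new.add(("not", a))
        for b in S:
            for op in ("and", "or", "->", "<->"):
                new.add((op, a, b))
    return new

F = [{"p0", "p1", "bot"}]        # F(0): the atoms
for k in range(2):               # compute two more stages, F(1) and F(2)
    F.append(step(F[-1]))

print(("and", "p0", ("not", "bot")) in F[2])  # True: appears by stage 2
print(("and", "p0", ("not", "bot")) in F[1])  # False: needs two stages
```

Every proposition enters the union at the finite stage matching its construction depth, so no quantification over all closed sets is ever needed.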

If you work within a set theory, you would generally construct $U$ via some comprehension axiom, and you can even do so in a single step without any recursion! Concretely:

Let $A$ be the set of propositional atoms.

Define $J(S) = \{ \ x : x{∈}S ∨ ∃y,z{∈}S\ ( \ x = y+z \ ) \ \}$ for any set $S$.

Let $U = \{ \ x : x{∈}Str ∧ ∃k{∈}ℕ\ ∃F{∈}ℕ_{≤k}{→}Str\ ( \ \\\quad F(0) = A ∧ ∀i{∈}ℕ_{<k}\ ( \ F(i{+}1) = J(F(i)) \ ) ∧ x{∈}F(k) \ ) \ \}$.

Regarding how much set theory you actually need to get all the important theorems of logic: very little! See this post on building blocks. Note that encoding into subsystems of SOA (second-order arithmetic) requires Gödel coding if you want to really minimize assumptions; with that, ACA0 can prove the incompleteness theorems and the semantic completeness theorem for countable FOL theories. But if you do not like Gödel coding, you can construct a similar system based on TC.

In particular, just as ACA0 is essentially obtained from PA by reifying definitions over PA (as sets), we can define TC2 as TC for strings, plus a suitable form of induction, plus comprehension axioms that allow you to construct any set of the form $\{ \ x : Q(x) \ \}$ where $Q$ is a property over TC. Due to the inherent nature of strings, we have a much easier time encoding finite sequences of strings as a single string. See here under "Encoding program execution in a string".

It is also possible to argue that ACA0 or TC2 do not actually involve any set-theoretic assumptions, since at no point do we truly treat the 'sets' as objects. Instead, you can think of each 'set' as simply its defining property, which is just a string! In fact, each of them has a countable intended model matching this view! (The 'sets' in this model are exactly the sets that can each be computed using some finite Turing jump.)

Note that we do not appear to have any uncountable collection in real life, so it is not a big deal that ACA0/TC2 have countable intended models, nor that they can only prove theorems about countable FOL theories (which include themselves). In short, ACA0/TC2 arguably can "handle propositional and first-order logic without using set theory".

If you want to tackle higher logic, such as involving uncountable theories or uncountable models, you will definitely have to have something quite like a full-blown set theory. But the point is that you do not really need set theory for basic results about your chosen foundational FOL theory.