I am reading "Extensions of First Order Logic" by Maria Manzano (1996). It develops the thesis that
"[M]ost reasonable logical systems can be naturally translated into many-sorted first order logic. ... All the logic systems treated in this book are put in direct correspondence with many sorted logic, because this logic offers a unifying framework to place other logics. ... Currently [in 1996], the proliferation of logics used in philosophy, computer science, artificial intelligence, mathematics and linguistic makes a working reduction of this variety an urgent issue."
-- Maria Manzano, Extensions of First Order Logic
Interesting, but it's 22 years old! What is the status of this thesis now? Are there logic systems used in practice, say in computer science, foundations of physics, etc., that cannot be translated into many-sorted logic? Also, does this thesis say that we could directly use many-sorted logic in these applications, or is a lot lost in the translation in terms of convenience? Any reference would be appreciated. Are there more recent books that follow up on this thesis? BTW, I have just read the first three chapters and like this book very much. I would love to see the same thesis developed in a more recent book, perhaps from a different angle.
Here is a link to a PDF document, "Eight European Summer School in Logic, Language and Information Reader of Course: Extensions of First Order Logic", that basically consists of cut-and-paste sections of the book. The book has 350 pages, whereas this document has 90, but it maintains the same thesis. In fact, the Introduction is a straight cut and paste.
Here are some related questions: "Henkin vs. 'Full' Semantics for Second-order Logic and Multi-Sorted First Order Interpretations", https://mathoverflow.net/questions/105234/second-order-term-in-first-order-logic, and "advantage of first-order logic over second-order logic".
NOTE ADDED to clarify the question: Here is a quote from the book which provides some idea about the arguments that support the thesis:
Throughout the pages of this book you will find good reasons for wondering whether the philosophy of standard structures is the only possible choice. The reasons are directly related to the following questions:
- Are we satisfied with the limitation on the class of models they require? Would it not be highly instructive to discover new and sensible models for a known existing theory?
- Don’t we feel uneasy about crossing the border with set theory? Don’t second order validities refer to the given set-theoretical environment instead of the logic in itself?
- Do we need all the expressive power they provide?
-- Maria Manzano, Extensions of First Order Logic
The last two (rhetorical) questions point toward the idea that the goal of a logic should not be to talk in detail about the background set-theoretical environment, the mathematical universe. In other words, a so-called interpretation in the mathematical universe is not necessarily the "practical" interpretation that concerns us, and thus we should not care about the extra "expressive power" of standard SOL. If, given this extra flexibility, we accept Henkin semantics and thus give up the (supposedly, in practice, nonexistent) extra expressive power of standard SOL, then, as mentioned in many textbooks, we have:
In effect, using general pre-structures amounts to treating a second-order language as a many-sorted first-order language.
-- Enderton, Herbert B., in "Second-order and Higher-order Logic", The Stanford Encyclopedia of Philosophy (Fall 2015 Edition), Edward N. Zalta (ed.), URL = https://plato.stanford.edu/archives/fall2015/entries/logic-higher-order/.
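To be explicit about the correspondence Enderton mentions, here is a rough sketch of the usual translation for monadic second-order variables (the notation is mine, not Manzano's): introduce a sort $\iota$ for individuals, a sort $\sigma$ for "sets", and a membership predicate $E$ between them, and translate
$$
\begin{aligned}
&(X(t))^{*} := E(t,X), \qquad (\forall x\,\varphi)^{*} := \forall x^{\iota}\,\varphi^{*}, \qquad (\forall X\,\varphi)^{*} := \forall X^{\sigma}\,\varphi^{*},\\
&\text{Comprehension: } \exists X^{\sigma}\,\forall x^{\iota}\,\bigl(E(x,X) \leftrightarrow \varphi(x)\bigr) \quad (X \text{ not free in } \varphi),\\
&\text{Extensionality: } \forall X^{\sigma}\,\forall Y^{\sigma}\,\bigl(\forall x^{\iota}\,(E(x,X) \leftrightarrow E(x,Y)) \rightarrow X = Y\bigr).
\end{aligned}
$$
The general (Henkin) models of the second-order theory then correspond exactly to the ordinary first-order models of this two-sorted theory; relation variables of higher arity are handled the same way with one extra sort per arity.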
Because this last point is well known, it is not the subject of this question. The thesis that is the subject of the question is whether, in practice, we are OK with giving up the extra expressive power and other semantic properties of standard SOL. In that sense, is many-sorted logic really a unifying logic?
SECOND NOTE ADDED. Here is an excerpt from the abstract of a paper that @Carl Mummert suggested to me as a place to find the answer:
This paper aims to outline an analysis and interpretation of the process that led to First-Order Logic and its consolidation as a core system of modern logic. We begin with an historical overview of landmarks along the road to modern logic, and proceed to a philosophical discussion casting doubt on the possibility of a purely rational justification of the actual delimitation of First-Order Logic.
-- Ferreirós, José (2001). "The Road to Modern Logic - An Interpretation". Bulletin of Symbolic Logic, 7. doi:10.2307/2687794.
THIRD NOTE ADDED. Indeed, thanks to @Carl, the issue is discussed in depth in this article. Here is an excerpt that gives the idea:
Skolem’s conference, obscurely published as a paper in 1923, was a masterpiece of clarity and rigorous argument. The only point that is not clearly argued is, unfortunately, why (as he asserts) axiomatization requires a restriction of the quantifiers to the first-order level.
... if we are interested in producing an axiomatic system, we can only use first-order logic. I interpret this to mean that, in his view, the spirit of axiomatics—in the tradition of Pasch, Dedekind, Peano, Hilbert— can only be consistent with the use of FOL as an underlying logic. Let us see how this interpretation can be justified.
... the requirement [for a rigorous derivation of a body of theory] was recognized as equivalent with a principle of independence from meaning, and ultimately with the principle of free realizability of the axiom systems — the freedom, that is, to regard completely different object-domains as models of the system.
... reading the second-order quantifiers as referring to “any (all) class(es)” of objects in the domain, can we assume arbitrary classes, arbitrary subsets of the domain, or not? The former would be consistent with abstract mathematics but, by taking arbitrary subsets to be validated by logic, we would be moving in circles—preempting the desired result of securing with absolute certainty foundations of abstract mathematics.
This is an interpretation, given in 2001, of a 1923 argument of Skolem which, the article says, has been misunderstood! It is a strong interpretation in support of Henkin's general semantics, because the standard semantics is only one particular restricted meaning, the one associated with the background mathematical universe. This is basically the thesis stated five years earlier by Manzano, but also seventy-eight years earlier by Skolem, and perhaps often in between.
FOURTH NOTE ADDED after the answer of @user21820. Because of the three previous notes, the emphasis shifted to a comparison between second-order logic and first-order logic and the related issue of expressive power. However, before these notes were added, everything in the question, especially the first paragraph, was about the possibility of efficiently translating other logics, not only SOL, into many-sorted logic. Much of Manzano's book is about the translation of other logics into many-sorted logic; this is the main way she defends the thesis that it is a unifying logic. Still, it is necessary to clear up the confusion created by the different notions of "expressive power" involved in the comparison between FOL and SOL, so that the answer can focus on the main question. Because this would take too long as a comment or an added note (to express what is already understood), I have provided my answer to that aspect of the question in the list of answers below.
It is actually misleading to think of SOL with full semantics as 'allowing talk' about the background mathematical universe. This is because full semantics is really and truly a semantic notion totally separate from the (syntactic) second-order system itself. I think it is important to maintain a very clear distinction between syntax and semantics; a system cannot talk about any interpretations of itself. Even if we can interpret it to be doing that, it is still just an interpretation, and not what it is doing. In other words, the symbols are meaningless until we interpret them. You can choose to interpret an SOL system using full semantics, or you can choose to use Henkin semantics, and neither choice has anything to do with the system itself; it generates the same theorems.
Furthermore, a formal system meant for foundational purposes has to be one with a computable proof verifier (or, equivalently, a computable theorem generator). So it would make no sense to ask for a formal system that is complete with respect to SOL with full semantics: any reasonable foundational system must be able to reason about finite program runs, and by the generalized incompleteness theorems this is incompatible with completeness for full-semantics SOL, since PA2 (second-order PA) is categorical but the theory of $\mathbb{N}$ is uncomputable.
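To spell out that last step a bit (this is just my paraphrase of the standard argument):
$$
\begin{aligned}
&\text{Suppose a formal system $F$ with a computable proof verifier were complete for full semantics: } F \vdash \psi \iff\ \models_{\text{full}} \psi.\\
&\text{PA2 has finitely many axioms, and categoricity gives } \models_{\text{full}} (\mathrm{PA2} \rightarrow \varphi) \iff \mathbb{N} \models \varphi \text{ for arithmetical sentences } \varphi.\\
&\text{Then } \operatorname{Th}(\mathbb{N}) = \{\varphi : F \vdash (\mathrm{PA2} \rightarrow \varphi)\} \text{ would be computably enumerable, which is impossible.}
\end{aligned}
$$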
This means that if we are interested in full-semantics SOL it cannot be for foundational purposes, and so must simply be a mathematical notion residing in our meta-system (MS). Of course in MS we can discuss formal systems as objects, but for the reasons above, MS cannot be based on any formal system that is complete with respect to full-semantics SOL, and so necessarily no system captured by an object in MS can pin down the 'true' natural numbers, because the mathematical notion $\mathbb{N}$ itself is merely an object in MS, very incompletely pinned down by MS itself. Same goes for anything that appears to be essentially uniquely defined by some second-order system under full semantics.
As explained above, one cannot consider full-semantics SOL to have more "expressive power" in the specific sense of being able to express more, since it does not. Of course, MS may assume (or prove) the existence of a full-semantics model of PA2, and may further prove that this model is essentially unique. But (in my view) it is not full-semantics PA2 that is more expressive than (first-order) PA; it is MS that is more expressive!
To clarify my objection to the use of the word "expressive", consider what you think PA2 is really expressing. Its meaning under full semantics can only be defined with respect to the collections that exist in MS, and what collections exist in MS are obviously governed by MS. So it is misleading to think of PA2 as expressing anything on its own since its meaning is tied to what MS can express.
In contrast, the notion of models of an FOL system is quite independent of MS. Although the domain still must be a collection in MS, and each symbol must be interpreted by some function or relation in MS, the truth-values of sentences in each model are fixed, independently of MS (as long as you agree that collections, functions and relations are meaningful independently of MS). Similarly with Henkin models of an SOL system. This is not so with full-semantics models of an SOL system, because the collection of sets must be exactly the powerset of the collection of individuals, where "powerset" must exactly coincide with what MS 'thinks'. Clearly, then, we are importing all the assumptions of MS into the notion of a full-semantics model.
So (as per above) it does not make sense to say that full-semantics SOL has any extra expressive power over FOL at all. It is analogous to claiming that we can get more by assuming that $\mathbb{N}$ is a standard model of PA than by just assuming it is a model of PA. Of course we can, but this assumption is made in MS, not in PA.
In the same manner, if we are building a foundational higher-order formal system, we should reasonably want it to be complete for Henkin semantics, because that captures more or less which second-order objects we are certain must exist (since the comprehension axioms correspond exactly to the subsets definable over the first-order objects). Thus a formal system that is complete for Henkin semantics is not only more general (unifying) than one for full semantics, but will also be more robust.
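For concreteness, the comprehension scheme I am referring to is (one instance for each formula $\varphi$ in which $X$ does not occur free, parameters allowed):
$$
\exists X\,\forall x\,\bigl(X(x) \leftrightarrow \varphi(x)\bigr),
$$
and a deductive system with these instances is (by Henkin's completeness theorem) complete with respect to the general models satisfying them; each instance asserts only the existence of subsets that are definable (from parameters) over the first-order part.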
For example, any reasonable MS can prove (via the incompleteness theorems) that if MS itself is consistent then MS has a model whose $\mathbb{N}$ is not isomorphic to the 'real' one. So it means nothing for second-order PA to be categorical under full semantics. In contrast, there is no 'illusion' of categoricity for Henkin semantics.
Note that what I wrote is based on my interpretation of the English words "expressive" and "talk" and "unifying" and so on, and other logicians may not have the same notions in mind when they use the same words. But anyway the mathematics is the same: MS knows that for every model of MS there will always be another model of MS that disagrees with it about the full-semantics model of PA2, even though PA2 is categorical, and that is the core of what I am getting at when I say that full semantics gives an 'illusion' of expressiveness. It is not that Henkin semantics allows you to 'express less', but rather that it does not give the impression that you can pin down structures that cannot be pinned down. Hope it is clearer. =)
And now I'm going to say something about translation of practical formal systems into FOL.
Technically speaking, every possible practical foundational system $S$ can be translated uniformly and easily into FOL. This is because (as stated above) it must have a computable proof verifier, and so even a weak MS such as TC (theory of concatenation) or PA will be able to 'trace' the finite runs of the verifier and hence capture which statements are proven by $S$.
Specifically, there is a $2$-parameter formula $V$ over TC such that for all strings $p,x$ we have that MS proves $V(p,x)$ iff $p$ is a valid proof over $S$ of the statement $x$. (The same holds for PA, but we of course have to encode strings, say using Gödel numbering, whereas the encoding is trivial for TC.) Note that MS is a first-order theory, and so we have effectively translated $S$ into a first-order system, since the theorems of $S$ can be recovered directly from the theorems of MS of the form $V(p,x)$. (Note that this depends on the soundness of MS, but almost every logician accepts the soundness of TC and PA.)
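To give the flavour of what such a $V$ looks like (a schematic rendering only; the real construction just arithmetizes the verifier):
$$
\begin{aligned}
V(p,x)\ :\equiv\ \ & \text{"$p$ encodes a finite sequence of strings $s_1,\dots,s_n$ with $s_n = x$"}\\
\wedge\ & \forall i \le n\ \bigl(\text{"$s_i$ is an axiom of $S$"} \ \vee\ \text{"$s_i$ follows from $s_1,\dots,s_{i-1}$ by an inference rule of $S$"}\bigr),
\end{aligned}
$$
where each quoted clause is decidable (because $S$ has a computable proof verifier) and hence expressible by a $\Sigma_1$ formula over TC or PA.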
Such a translation is not only uniform for all practical formal systems but also easy to perform. However, the catch should be obvious; it is a purely syntactic translation. It fulfills the stated goal of representing any target system within a first-order system, but you may ask whether there is a 'natural' translation that also 'captures' the intended semantics. For example, you might want (what are intended to be interpreted as) collections in $S$ to also be something like collections in the target first-order system $T$.
But what does that actually mean? In the first place, the notion of "collection" will have to be a meta-theoretic notion, namely a concept in MS, otherwise the goal would be no more meaningful than a syntactic translation. Then the answer is no; there cannot always be 'natural' translations, because the intended semantics for collections in $S$ may well be incompatible with the intended semantics for collections in MS. For example, if MS is ZFC and $S$ is NF, then it is impossible for $U := \{ x : x=x \}$ in NF to be a set in ZFC with the same members, since $U \in U$ in NF, contrary to Foundation in ZFC. Although NF is a classical first-order theory, you can imagine non-classical variants of NF that still admit a universal 'set'. The point remains that you cannot expect to be able to 'naturally' translate any given arbitrary $S$ uniformly into FOL, without it degenerating into a syntactic transform.
Another way to put it is that 'natural' translations are necessarily ad-hoc. For example, you can 'naturally' translate HOA (higher-order arithmetic) into ZFC because we can construct iterated powersets of $\mathbb{N}$ in ZFC, and you can 'naturally' translate plenty of stratified type theories (such as CiC) into ZFC plus some large cardinal (e.g. CiC into ZFC plus countably many inaccessibles), because the stratification ensures that the intended semantics are compatible.
Moreover, I would say that it is impossible to 'naturally' translate even some 'pure logics' into FOL, such as provability logic. This is because the modal operator allows us to drop quotation symbols, which in the first place cannot be represented natively as first-order symbols, since quotations are essentially meta-operators. Observe that the PDF you linked to essentially constructs Kripke frames to translate the modal logics mentioned. This is clearly an instance of importing capabilities of MS, and not a 'natural' translation. It also only works because we have proven that those modal logics correspond to some first-order frame condition! If you actually perform that given translation on provability logic and try working in the resulting first-order system, I am sure you will be forced to agree that it is not 'natural'.
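For comparison, the Kripke-style translation I am referring to is (roughly) the standard one: add a sort of 'worlds' and an accessibility relation $R$, turn each propositional letter $p$ into a unary predicate $P$ on worlds, and translate
$$
(\Box\varphi)^{*}(w) := \forall w'\,\bigl(R(w,w') \rightarrow \varphi^{*}(w')\bigr), \qquad
(\Diamond\varphi)^{*}(w) := \exists w'\,\bigl(R(w,w') \wedge \varphi^{*}(w')\bigr),
$$
together with whatever frame conditions characterize the particular modal logic, when those conditions happen to be first-order expressible.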
As for foundational systems that are applied (in computer science or the physical sciences), the whole purpose of designing stratified type theories was to make them somewhat philosophically justifiable (and hence hopefully not suffer from arithmetical unsoundness). Also, as far as we know, almost every mathematical theorem with direct real-world applications can be proven naturally in HOA (especially if you have the base types $\mathbb{N}$ and $\mathit{bool}$ and you have the type $S \to T$ for all types $S,T$, rather than just one type for each powerset iterate of $\mathbb{N}$). This is of course trivial to express as a many-sorted FOL system, with one sort for each type.
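A bare-bones way to set that up (one of several possible presentations; the details here are mine):
$$
\begin{aligned}
&\text{Sorts: one sort for each type generated from $\mathbb{N}$ and $\mathit{bool}$ by $S \to T$;}\\
&\text{Function symbols: } 0,\ \mathrm{succ},\ \mathrm{true},\ \mathrm{false}, \text{ and } \mathrm{app}_{S,T} : (S \to T) \times S \to T \text{ for all types } S,T;\\
&\text{Axioms: extensionality for each sort $S \to T$, comprehension giving a function for each definable functional graph, and induction on $\mathbb{N}$.}
\end{aligned}
$$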
It is true that HOA is unable to prove things about arbitrary types, and that perhaps is the most convincing reason we should go beyond it, if only for conceptual 'completeness'. Unfortunately, as Russell's paradox shows, we cannot possibly have both boolean set membership and unrestricted set comprehension. ZFC discards the latter. But since practical applications almost solely involve 'concrete' objects (below some finite order in HOA), it is unlikely that future proof assistants will be based on radically different foundational systems, since mathematicians will not give up HOA, and it is sufficient for 'concrete' mathematics.
Currently, as far as I know all major proof assistants (such as those listed on Wikipedia) are indeed based on either a classical set theory like ZFC plus some large cardinal axiom, or a stratified type theory like CiC. For instance, Mizar is based on ZFC plus Grothendieck universes, and Coq is based on CiC. This at least implies that work done in established proof assistants can all be 'naturally' translated into a suitable first-order system. This is subjective, but you would have to try it yourself to form your own opinion.
Finally, I will note that many seemingly modal logics that are widely used in applied computer science are actually easily translatable into FOL. For example, temporal logic is easily expressed using an axiomatization of time and of events in time, instead of modal operators for tenses. Sure, it will be less elegant, but it is far more general and also fits nicely into standard foundational systems.
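As a sketch of what I mean for, say, linear temporal operators (these are the usual clauses, but only one way to set it up): work in a first-order theory of a linear (say discrete) time order $\le$, turn each atomic proposition $p$ into a unary predicate $P(t)$, and translate
$$
\begin{aligned}
(\mathbf{G}\varphi)^{*}(t) &:= \forall t'\,\bigl(t \le t' \rightarrow \varphi^{*}(t')\bigr), \qquad
(\mathbf{F}\varphi)^{*}(t) := \exists t'\,\bigl(t \le t' \wedge \varphi^{*}(t')\bigr),\\
(\varphi\,\mathbf{U}\,\psi)^{*}(t) &:= \exists t'\,\bigl(t \le t' \wedge \psi^{*}(t') \wedge \forall t''\,(t \le t'' \wedge t'' < t' \rightarrow \varphi^{*}(t''))\bigr).
\end{aligned}
$$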