Define "first-order language" without using it

473 Views Asked by At

I am struggling to define "first-order language." In principle, a first-order language is any language produced by a "first-order grammar," but there doesn't seem to be any way to state what a "first-order grammar" is without employing the use of first-order logic at some point. In particular, I can't seem to reduce...

if $F$ is an $n$-ary function symbol and $t_1,\ldots,t_n$ is a list of $n$ terms, then $F(t_1,\ldots,t_n)$ is a term

...and...

if $R$ is an $n$-ary relation symbol and $t_1,\ldots,t_n$ is a list of $n$ terms, then $R(t_1,\ldots,t_n)$ is a formula

...to production rules. This is a problem because "is $n$-ary" is a proposition which must be proven - leading very quickly to circularity. I am not comfortable pretending that an infinitely long list of grammars is contained within any medium currently available (if you happen to have developed some means of encoding an infinite amount of information, please let me know). So what is a first order language, and how do I know that $\mathcal{L}$ is first-order without knowing what a "number" is?


Clarification

I understand quite well the intended meaning of "first-order language." My issue is that I see no way of conveying this meaning in a purely mechanical, finitistic way. It is not possible to commit "first-order languages" to writing, because the totality of "first-order languages" is infinite - the best we can hope for is a finite set of instructions for generating arbitrary finite subsets of arbitrary first-order languages. In reality, this is all any author has ever actually done.

The reason I assume these instructions to take the form of a formal grammar is that this is how first-order logic is introduced. In identifying the class of first-order theories, authors often introduce a formal grammar (sometimes in the form of "inductive definitions," which are just production rules stated in English) to describe what a first-order language is.

The entire problem is easily resolved using conditional rewriting rules in place of a grammar - but I have never seen anyone do this.


For reference:

Bell and Slomson Models and Ultraproducts: An Introduction

Ben-Ari Mathematical Logic for Computer Science

Pudlák Logical Foundations of Mathematics and Computational Complexity

Weiss Fundamentals of Model Theory

Wikipedia First-order Logic

A few other books that I haven't read in a while go through the "from language to logic" process as well, and I will add them as soon as I find them.

1

There are 1 best solutions below

0
On

Not only can the complete syntax of first order languages be defined in terms of a single grammar, this grammar is context-free. As a consequence, whether or not a given string is a formula in a first order language is decidable. The grammar is as follows...

$$\begin{align} &\langle formula\rangle&\to&\qquad\neg\langle formula\rangle\\ &&&\qquad\mid(\langle formula\rangle\langle connective\rangle\langle formula\rangle)\\ &&&\qquad\mid\langle quantifier\rangle\langle variable\rangle(\langle formula\rangle)\\ &&&\qquad\mid\langle relation\rangle\langle args\rangle\\ &\langle relation\rangle&\to&\qquad\langle relation\rangle_\text{I}\mid R\\ &\langle args\rangle&\to&\qquad ^\text{I}\langle args\rangle\langle term\rangle\mid\varepsilon\\ &\langle term\rangle&\to&\qquad\langle variable\rangle\mid\langle function\rangle\langle args\rangle\\ &\langle function\rangle&\to&\qquad\langle function\rangle_\text{I}\mid F\\ &\langle variable\rangle&\to&\qquad\langle variable\rangle_\text{I}\mid v\\ &\langle quantifier\rangle&\to&\qquad\exists\mid\forall \end{align}$$

...using $\langle formula\rangle$ as the start symbol, and $^\text{I},_\text{I},(,),\varepsilon,F,R,v,\exists,\forall$ as terminals. It is acceptable to exclude the connectives here solely because the set of functionally complete logical connectives is finite. It is possible to create a list of grammars for each set of logical connectives within a finite space, and I will do so if asked.

The language generated by this grammar includes all syntactically valid first-order formulas, and thus suffices to define the notion of "first-order languages." Note that constant symbols - and, incidentally, propositional variables (which is something that I did not expect) - are accounted for by $0$-ary function and relation symbols, respectively. The arity of a function or relation symbol is explicitly stated by the tallies in the upper index, while the tallies in the lower index serve to distinguish function/relation symbols of the same arity - e.g. $F_\text{II}^\text{III}$ is $\text{II}^\text{th}$ ($2^\text{nd}$) $\text{III}$-ary (3-ary) function symbol.

The resulting formulas necessarily match the arity of each function/relation symbol with an argument of the appropriate length. For example, the formula...

$$\exists v(\forall v_\text{I}((R_\text{II}^\text{II}vF^\text{II}v_\text{II}v_\text{IIII}\land\neg R_\text{I}^\text{II}v_\text{I}F_\text{III})))$$

...("$\exists x.\forall y.R_2(x,F(u,v))\land R_1(y,c)$, where $R_1,R_2$ are binary relations, $F$ is a binary function, and $c$ is a constant" in commonspeak) is well formed, while...

$$(\neg R_\text{I}^\text{III}F_\text{II}v\lor\exists v_\text{II}(R\land R^\text{II}v_\text{II}v))$$

...("$R_1(c,x)\implies\exists y.\varphi\land R(y,x)$, where $R_1$ is a ternary relation symbol, $R$ is a binary relation symbol, $c$ is a constant, and $\varphi$ is a propositional variable" in commonspeak) is not.

It is worth noting that the arity of a function/relation symbol is defined in terms of the list of arguments, instead of being "bound" to a specific function/relation symbol. This is actually rather profound - we cannot define a grammar where the arity of a function/relation symbol occurs in the righthand side of a production rule independently of the arguments; productions of the form...

$$\begin{align} &\langle formula\rangle&\to&\qquad\langle relation\rangle\langle args\rangle\\ &\langle term\rangle&\to&\qquad\langle function\rangle\langle args\rangle\\ &\langle relation\rangle&\to&\qquad\langle relation\rangle_\text{I}\mid\langle relation\rangle^\text{I}\mid R\\ &\langle function\rangle&\to&\qquad\langle function\rangle_\text{I}\mid\langle function\rangle^\text{I}\mid F\\ &\langle args\rangle&\to&\qquad\langle args\rangle\langle term\rangle\mid\varepsilon \end{align}$$

...where the nonterminals $\langle function\rangle$ and $\langle relation\rangle$ are replaced with function/relation symbols of a particular arity, yield invalid formulas. In other words, "arity" if it is to be considered a part of syntax, cannot be considered a property of function/relation symbols themeselves.