What is a mathematically rigorous treatment of "sentences", "words" and "language"?


How do mathematicians rigorously deal with words and sentences? In other words, how are words and sentences constructed in a mathematical sense? Is there some (accepted) system for constructing words?

I know this is a rather broad question, but I am in the process of trying to think about "words" and "sentences" more rigorously, because often in computational mathematics we deal with sets of words or sets of sentences. These words are then transformed in some ways into the numerical domain to perform calculations on them.

An attempt could be as follows:

To start we may define a set containing all the symbols in a "language system", call it $\mathcal{S}$.

Then a word is just a finite combination of a subset of these symbols. We can either define it as a relation $r: \{a_1, \ldots, a_n\} \subset \mathcal{S} \mapsto w$, or perhaps as an element of the power set of $\mathcal{S}$, say $\mathcal{W} := \mathcal{P}(\mathcal{S})$.

Then, building upon this, a sentence would be some combination of words, i.e., an element of $\mathcal{P}(\mathcal{W})$, where we define a function that puts the correct punctuation in place.

Just some thoughts. I wonder if people have tried to make this more rigorous.

Note: I briefly studied regular expressions and formal languages (a long time ago), but I don't really think we necessarily have to abstract everything to $0$s and $1$s to make sense of words.

On BEST ANSWER

To my knowledge there is no deep study of such 'language systems', but the study of logic has made quite some progress in that regard. There are multiple ways of formalizing the concept of a language system, such as plain sequences, trees, or others, but all these formulations are equivalent.
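As a minimal sketch of the "plain sequences" view (the two-letter alphabet and the helper name `words_of_length` are assumptions made up for illustration): a word can be modelled as a finite tuple of symbols, so that order and repetition are part of the word.

```python
from itertools import product

# Hypothetical alphabet, chosen only for illustration.
ALPHABET = ("a", "b")

def words_of_length(alphabet, n):
    """Return all words (finite sequences of symbols) of length n."""
    return list(product(alphabet, repeat=n))

ab = ("a", "b")
ba = ("b", "a")
aab = ("a", "a", "b")

# As sequences, these three words are pairwise distinct.
print(ab != ba and ab != aab and ba != aab)

# As bare sets of symbols they collapse to the same object,
# because a set records neither order nor repetition.
print(set(ab) == set(ba) == set(aab))
```

Under this encoding the set of all words is the union of `words_of_length(ALPHABET, n)` over all $n \ge 0$, and a language would be any subset of it.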

Model theory may be what you have in mind: it studies theories and their structures, but to write down a theory's axioms one needs a language. I will give a brief definition of a language, using a language for the theory of natural numbers as an example. A language has:

  1. $n$-place function symbols. E.g. the 2-ary addition $+$, the 2-ary multiplication $\times$ and the 1-ary successor $S$.
  2. $n$-place relation symbols. E.g. the 2-ary less-than relation $<$, or the 1-ary relation $Odd$.
  3. Constants. E.g. $0, 1, 2, \ldots$ (As you may have noticed, you only need the constant $0$; the rest can be written down using it and the successor function.)
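The three ingredients above can be sketched as plain data. This is only an illustration: the dictionary names, the nested-tuple encoding of terms, and the helper `numeral` are assumptions, not a standard API.

```python
# Hypothetical encoding of the language of arithmetic: symbol -> arity.
FUNCTIONS = {"+": 2, "*": 2, "S": 1}
RELATIONS = {"<": 2, "Odd": 1}
CONSTANTS = {"0"}

def numeral(n):
    """Write the natural number n using only the constant 0 and successor S.

    Terms are encoded as nested tuples, e.g. 2 becomes ("S", ("S", "0")).
    """
    term = "0"
    for _ in range(n):
        term = ("S", term)
    return term

print(numeral(0))
print(numeral(3))
```

This makes the remark about constants concrete: every other numeral is a term built from $0$ by applying $S$ repeatedly.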

Similarly, logic itself has a language ($\wedge, \forall, \lnot$, etc.; though this language is usually taken as primitive). Now every finite sequence of these symbols is an expression, but we only care about well-formed formulas, which are those that satisfy some requirements. E.g. a 2-ary relation symbol requires a term on each side of it, as in $1<2$. Common-sense restrictions like that can be used to generate, by recursion, the set of all well-formed formulas. The word "sentence" usually just means a well-formed formula with no free (unquantified) variables. Some nice things about this construction (assuming a lot of stuff I haven't mentioned explicitly):

  1. Unique readability theorem: There is only one way to parse a well-formed formula, so there is no ambiguity in how it is read.

  2. Effective computation: There is an algorithm to check whether an expression is a well-formed formula.
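A rough sketch of such a check, under stated assumptions: formulas are stored as nested tuples (that is, already as trees, which sidesteps the string-parsing question that unique readability settles), only the connectives `not`, `and` and `forall` are included, and all names below are hypothetical.

```python
# Hypothetical signature for arithmetic: symbol -> arity.
FUNCTIONS = {"+": 2, "*": 2, "S": 1}
RELATIONS = {"<": 2, "Odd": 1}
CONSTANTS = {"0"}
VARIABLES = {"x", "y", "z"}

def is_term(e):
    """A term is a constant, a variable, or f(t1, ..., tn) with f n-ary."""
    if isinstance(e, str):
        return e in CONSTANTS or e in VARIABLES
    if not isinstance(e, tuple) or not e or not isinstance(e[0], str):
        return False
    head, *args = e
    return FUNCTIONS.get(head) == len(args) and all(is_term(a) for a in args)

def is_formula(e):
    """Recursive check mirroring the recursive definition of formulas."""
    if not isinstance(e, tuple) or not e or not isinstance(e[0], str):
        return False
    head, *args = e
    if head in RELATIONS:
        return RELATIONS[head] == len(args) and all(is_term(a) for a in args)
    if head == "not":
        return len(args) == 1 and is_formula(args[0])
    if head == "and":
        return len(args) == 2 and all(is_formula(a) for a in args)
    if head == "forall":
        return (len(args) == 2 and isinstance(args[0], str)
                and args[0] in VARIABLES and is_formula(args[1]))
    return False

def free_vars(e):
    """Free variables of a term or formula assumed to be well formed."""
    if isinstance(e, str):
        return {e} if e in VARIABLES else set()
    head, *args = e
    if head == "forall":
        return free_vars(args[1]) - {args[0]}
    return set().union(*(free_vars(a) for a in args))

def is_sentence(e):
    """A sentence is a well-formed formula with no free variables."""
    return is_formula(e) and not free_vars(e)
```

For example, `is_formula(("<", ("S", "0"), ("S", ("S", "0"))))` accepts the encoding of $1<2$, `is_formula(("<", "0"))` rejects a relation with a missing argument, and `is_sentence(("forall", "x", ("<", "x", ("S", "x"))))` holds while `is_sentence(("<", "x", "0"))` does not, because $x$ is free there.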

There's a lot more to say, but that would make me write a book, so I'd better just recommend an actual book on first-order logic: I personally like Enderton's A Mathematical Introduction to Logic.