Is a formal language an algebraic structure?


I am having trouble deciding whether a formal language is an abstract structure or an algebraic structure.

As far as I am aware, a formal language is a set $W$ of well-formed formulae. On Wikipedia it is written that:

A formal language is an abstract structure, but a natural language is not, because its rules of grammar and syntax are open to debate and interpretation.

But I believe I can show that $W$ is an algebraic structure (also see Wikipedia).


For a countable set $P$ of atomic propositions $\alpha,\beta,\gamma,\ldots$, I could let $P\subset W$ and introduce a set $\{\neg,\land,\lor,\to,\leftrightarrow\}$ of operators, each of some finite arity $n\in\mathbb N^+$ (here $\neg$ is unary and the rest are binary). Define:

$$\forall \alpha,\beta\in P,\quad \big(\{\alpha,\beta\}\subset W\big) \to \big(\{\neg\alpha, \alpha\land\beta,\alpha\lor\beta,\alpha\to\beta,\alpha\leftrightarrow\beta\}\subset W\big).$$

Now define a mapping $v:P\longrightarrow\{0,1\}$ such that its relation $R\subseteq P\times\{0,1\}$ obeys:

$$\forall\alpha\in P\;\forall p\in \{0,1\}\;\forall q\in \{0,1\},\quad \big((\alpha,p)\in R\land (\alpha,q)\in R\big)\to (p=q).$$

Then we extend $v$ to a map $v:W\longrightarrow \{0,1\}$ by defining, inductively:

$\big(v(\alpha)=1\big)\leftrightarrow \big(v(\neg\alpha)=0\big)$,

$\big(v(\alpha\lor\beta)=1\big)\leftrightarrow \big(v(\alpha)=1\lor v(\beta)=1\big)$,

$\big(v(\alpha\land\beta)=1\big)\leftrightarrow \big(v(\alpha)=1\land v(\beta)=1\big)$,

$\big(v(\alpha\to\beta)=1\big)\leftrightarrow\big(v(\alpha)=0\lor v(\beta)=1\big)$,

$\big(v(\alpha\leftrightarrow \beta)=1\big)\leftrightarrow\big(v(\alpha)=1=v(\beta)\lor v(\alpha)=0=v(\beta)\big)$.
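The inductive clauses above can be sketched in code. This is a hedged illustration only: the encoding of formulas as nested tuples and all names (`extend`, `"var"`, `"imp"`, etc.) are my own choices, not part of the question.

```python
# Formulas as nested tuples: ("var", "a"), ("not", phi), ("and", phi, psi), ...
def extend(v):
    """Extend a valuation v on atomic propositions P to all of W (sketch)."""
    def vbar(phi):
        op = phi[0]
        if op == "var":
            return v[phi[1]]                   # base case: v on P
        if op == "not":
            return 1 - vbar(phi[1])            # v(neg a) = 1 iff v(a) = 0
        a, b = vbar(phi[1]), vbar(phi[2])
        if op == "and":
            return min(a, b)
        if op == "or":
            return max(a, b)
        if op == "imp":
            return max(1 - a, b)               # v(a -> b) = 1 iff v(a)=0 or v(b)=1
        if op == "iff":
            return int(a == b)
        raise ValueError(op)
    return vbar

v = extend({"a": 1, "b": 0})
v(("imp", ("var", "a"), ("var", "b")))  # evaluates a -> b under v(a)=1, v(b)=0: gives 0
```

Note that `extend` consumes only the base valuation on $P$; the recursion does the rest, which is exactly the inductive definition above.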

Therefore: there exists a nonempty set $W$, there exists a collection of operators on $W$ with finite arity, and there exists a finite set of identities that the operators satisfy.


By the conclusion above, shouldn't a formal language be defined as an algebraic structure? Or perhaps it is an example of a structure that is both algebraic and abstract.


There are 3 solutions below.

BEST ANSWER

Your language is definitely an algebraic gadget. It's context free, and any context free language can be viewed as a kind of "algebra". This paper of Chomsky and Schützenberger (The Algebraic Theory of Context-Free Languages) is a nice (and readable) historically relevant view of the subject, and discusses a tight (and very interesting!) connection between formal languages and formal power series. This is analogous to the connection between a sequence of integers and its generating function, but there's actually something more overtly algebraic you can do. I'll summarize that here, but you can read more in any decent text on universal algebra.

We think of a grammar as telling us which words we can build from pre-existing words. For your example, we have a fairly simple grammar:

  • $x$ is a word for every variable/atomic proposition $x$
  • If $\varphi$ is a preexisting word, $\lnot \varphi$ is a word
  • If $\varphi$ and $\psi$ are preexisting words, then $\varphi \land \psi$, $\varphi \lor \psi$, $\varphi \to \psi$, etc. are words
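These grammar rules generate $W$ in stages, which can be made concrete with a small sketch (the tuple encoding and the function name `words_up_to` are my own assumptions):

```python
from itertools import product

def words_up_to(depth, atoms=("a", "b")):
    """All formulas (as nested tuples) the grammar builds in <= depth steps."""
    layer = {("var", x) for x in atoms}            # rule 1: atoms are words
    for _ in range(depth):
        new = {("not", p) for p in layer}          # rule 2: negations
        new |= {(op, p, q) for op in ("and", "or", "imp", "iff")
                for p, q in product(layer, layer)} # rule 3: binary connectives
        layer |= new
    return layer

len(words_up_to(1))  # 2 atoms + 2 negations + 4*4 binary combinations = 20
```

Iterating `words_up_to` with ever larger `depth` enumerates all of $W$, which is one way to see that $W$ is the set generated freely by the grammar.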

Now, as you've expected, these form an algebra (indeed a free algebra) which is often called the "algebra of terms" because the elements of this algebra are terms :P.

Formally, we look at the set of all words generated by this grammar, which you're calling $W$. Then we add an operation for each constructor: for us, a unary operation $\lnot : W \to W$, a binary operation $\land : W^2 \to W$, etc. (NB: for general context-free grammars, some of our operators might only be partially defined. We can get around this by working with multi-sorted algebras.)

While we could stop here, we almost certainly want to consider some of these words "equal". That is, we should probably think of $\varphi \land (\psi \land \theta)$ as the same as $(\varphi \land \psi) \land \theta$, even though they're technically different words. Similarly, we might want to identify the words $\varphi \land (\psi \lor \theta)$ and $(\varphi \land \psi) \lor (\varphi \land \theta)$, since in applications these are usually equivalent formulas. So we can quotient our term algebra (that is, $W$) by the equivalence relation saying that these words are equivalent. It turns out that this is also a computable thing to do, if you're interested in such things, but I won't say more about that here.
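One concrete way to realize such a quotient is to compute a normal form. Here is a sketch for the associativity identification only, with atoms written as bare strings for brevity; the function name and encoding are my own assumptions:

```python
def flatten_and(phi):
    """Normal form identifying (p ∧ q) ∧ r with p ∧ (q ∧ r):
    flatten nested "and" nodes into a single n-ary "and" node."""
    if not isinstance(phi, tuple):
        return phi                                 # an atom, e.g. "p"
    if phi[0] == "and":
        parts = []
        for sub in phi[1:]:
            s = flatten_and(sub)
            # splice an inner "and" directly into the parent's argument list
            parts.extend(s[1:] if isinstance(s, tuple) and s[0] == "and" else [s])
        return ("and", *parts)
    return (phi[0], *map(flatten_and, phi[1:]))    # recurse under other connectives

left  = ("and", ("and", "p", "q"), "r")
right = ("and", "p", ("and", "q", "r"))
flatten_and(left) == flatten_and(right)  # both normalize to ("and", "p", "q", "r")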

Now, this particular algebraic structure (one with operators $\lnot$, $\land$, $\lor$, etc. satisfying some natural axioms) is very well studied. It's called a Heyting algebra, and what we've built here is the free Heyting algebra on our variables! I won't say more here, but you should know that this is an extremely useful construction in mathematical logic. If you have more general context-free languages (where we possibly identify words) we again get very useful things, oftentimes "free" structures (free groups can be constructed in this way), and this is also how programming languages are formally described!

Lastly, one reason to care about free algebras/term algebras is that it's very easy to define functions out of them. We have this nice recursive definition, so if we have any other algebra of the same type, all we need to do is say what happens to the variables. You've actually partially rediscovered this yourself in your extension of $v$. Since the term algebra is free, if we have any Heyting algebra $H$, any (set-theoretic) function $v : \mathsf{Variables} \to H$ extends uniquely to a homomorphism of Heyting algebras $\bar v : W \to H$. Indeed, we recursively do exactly what you expect:

  • $\bar v(x) = v(x)$ for each variable $x$
  • $\bar v(\lnot \varphi) = \lnot \bar v(\varphi)$
  • $\bar v(\varphi \land \psi) = \bar v(\varphi) \land \bar v(\psi)$
  • etc.
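This universal property can be sketched generically: hand the extension procedure the operations of any target algebra, plus a map on variables, and it produces the unique homomorphism on terms. The names (`hom_from_terms`, `bool_ops`) and the tuple encoding are my own assumptions.

```python
def hom_from_terms(ops, v):
    """Unique extension of v: Variables -> H to the term algebra.
    `ops` maps each connective name to the corresponding operation of H."""
    def h(phi):
        if phi[0] == "var":
            return v[phi[1]]                    # say what happens to variables...
        return ops[phi[0]](*map(h, phi[1:]))    # ...and recursion does the rest
    return h

# Example target: the two-element Boolean algebra {0,1}, a special Heyting algebra.
bool_ops = {
    "not": lambda a: 1 - a,
    "and": min,
    "or": max,
    "imp": lambda a, b: max(1 - a, b),
}
h = hom_from_terms(bool_ops, {"x": 1, "y": 0})
h(("imp", ("var", "x"), ("or", ("var", "x"), ("var", "y"))))  # x -> (x v y): gives 1
```

Swapping `bool_ops` for the operations of any other Heyting algebra changes the semantics without touching the recursion, which is exactly the freeness of $W$ at work.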

I hope this helps ^_^

ANOTHER ANSWER

But I believe I can show that $W$ is an algebraic structure

This does not quite make sense. $W$ is not defined as an algebraic structure (at least, not ordinarily; there is a rather clean definition of $W$ as the initial algebra of a certain functor, but this is not the sense in which you're using the term). What you can do is put an algebraic structure on $W$; saying you can show that $W$ is an algebraic structure conflates the set with the structure placed on it.

Define: $\forall \alpha, \beta \in P, (\{\alpha, \beta\} \subseteq W) \to (\{\neg\alpha, \alpha \land \beta, \alpha \lor \beta, \alpha \to \beta, \alpha \leftrightarrow \beta\} \subseteq W)$

This is not a definition of anything, so I have no idea what this statement means.

Now define a mapping $v : P \to \{0, 1\}$ such that its relation $R \subseteq P \times \{0, 1\}$ obeys: $\forall \alpha \in P\;\forall p \in \{0, 1\}\;\forall q \in \{0, 1\},\; \big((\alpha, p) \in R \land (\alpha, q) \in R\big) \to (p = q)$.

You did not define any particular mapping $v$. The condition that you are demanding $R$ satisfy is part of the definition of being a function, so I am confused as to why you included it here.

As for your definition of extending $v$ to $W$, this is the standard way of defining the semantics of the language. But the semantics of a given language are distinct from the syntax of the language. The language is defined by its syntax and is given meaning through its semantics. There can be more than one way of giving semantics to a given syntax.

ANOTHER ANSWER

When one creates a language, there are certain steps to be taken.

The first step is to provide an alphabet for the language, i.e. the set of symbols that we are allowed to use in order to form words; in your case, the set of variables and operators.

Next, we need a grammar for the language, which we can think of as a way to distinguish words from nonsensical sequences of symbols. In your case, that corresponds to the recursive definition of the well-formed formulae.

The next step, which is implicit in your construction, is to provide a syntax: a way of forming sentences from words. You can read more about syntax; a good starting point is "natural deduction systems with double negation elimination".

Last, but definitely not least, is providing semantics, i.e. "meaning", for our newly formed language. In your case, the algebraic structure you provide plays the role of semantics: it is the two-element Boolean algebra.

Making the distinction between what is syntactic and what is semantic answers your question. What is confusing you is that the language you have created appears to BE an algebraic structure. That is because this logic is sound and complete: soundness says that everything syntactically derivable is semantically valid, and completeness says that everything semantically valid is syntactically derivable. So in a sound and complete system, meaning and syntax are two sides of the same coin, sort of.