How do you go about formalizing a concept?


I am reading Gödel, Escher, Bach. In the first few chapters, the author shows what a formal system is and gives examples that eventually lead to a typographical formal system of strings that represents, or "captures", the concept of addition, multiplication, and so on.

I found this fascinating and have never quite thought of formal systems in the way presented.

So I started thinking up things that I could potentially formalize and I came up with a few.

However, I don't know where to start. For example, say (and this is hypothetical) I had observed that most informal conversations between two people eventually lead to talk of sex or toilet humour. How would I formally represent this? Is it even possible?

Please understand that I'm not asking for a way to formalize human behaviour or interaction itself, simply for a method to attempt it.

Any pointers will be helpful.

There are 4 answers below.

BEST ANSWER

There is a wide scientific literature on the use of formal systems to describe human behaviour. In most cases, these formalization techniques have been applied to interactive systems, e.g. to understand the causes of human errors in order to predict their occurrence in human reliability analyses, to explore human-automation interaction and identify mechanisms of task failure using temporal logic, to design systems that favour safe human work in complex, safety-critical settings, and so on.

The idea of applying Hofstadter's system to formalize human behaviour is intriguing. The content of his formal system is substantially typographical in nature, since it is characterized by strings of symbols which lack any inherent meaning. On the other hand, looking at the system from outside, we can establish one-to-one correspondences with real or abstract elements (Hofstadter calls them "interpretations"), including those necessary to model human interactions.

Hofstadter worked in this interesting field in the years following the publication of GEB, and carried out several studies applying his theories to the human mind and behaviour. For your specific question, I would strongly suggest you read a book published in 1995 by him and other members of the Fluid Analogies Research Group, entitled "Fluid Concepts and Creative Analogies: Computer Models of the Fundamental Mechanisms of Thought". This is a very interesting collection of papers dedicated to the use of formal systems and computer modeling to describe the dynamics of the human mind. In particular, you could focus on chapter 5 (The Copycat Project: A Model of Mental Fluidity and Analogy-making), which describes the architecture of the so-called "Copycat" program. Although the final objective of this program (creating a sort of artificial intelligence) is considerably more ambitious and challenging than simple formalization, the chapter deals with several important concepts that characterize human cognitive processes and interactions and that must be taken into account to reproduce them in formal models, such as "analogy-making" and the "parallel terraced scan" (see below).
Also, the paper illustrates how any computer model that aims to describe the dynamics of the human mind should include three elements to capture the parallel and random nature of human cognition: a slipnet (a network of nodes and links representing permanent concepts), a workspace (an operative working area), and a coderack (a set of codelets that can modulate activations in the slipnet and build structures in the workspace).
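As a very rough sketch of how those three components fit together, here is a toy version in Python. The class names, urgency values, and the single "noticer" codelet are all invented for illustration; this is nothing like Copycat's actual implementation:

```python
import random

# Toy sketch of the slipnet / workspace / coderack architecture:
# concept activations decay over time, and codelets are picked from the
# coderack with probability proportional to their urgency.  A codelet
# may activate slipnet concepts and build structures in the workspace.

class Slipnet:
    def __init__(self, concepts):
        self.activation = {c: 0.0 for c in concepts}

    def activate(self, concept, amount=1.0):
        self.activation[concept] = min(1.0, self.activation[concept] + amount)

    def decay(self, rate=0.1):
        for c in self.activation:
            self.activation[c] *= (1.0 - rate)

class Coderack:
    def __init__(self):
        self.codelets = []  # list of (urgency, callable) pairs

    def post(self, urgency, codelet):
        self.codelets.append((urgency, codelet))

    def run_one(self, slipnet, workspace):
        # Stochastic selection: higher urgency, higher chance of running.
        if not self.codelets:
            return
        urgencies = [u for u, _ in self.codelets]
        i = random.choices(range(len(self.codelets)), weights=urgencies)[0]
        _, codelet = self.codelets.pop(i)
        codelet(slipnet, workspace)

def noticer(slipnet, workspace):
    # A codelet that "notices" a sameness and records it as a structure.
    slipnet.activate("sameness")
    workspace.append("bond: a-a")

slipnet = Slipnet(["sameness", "successor", "letter"])
coderack = Coderack()
workspace = []
coderack.post(urgency=5.0, codelet=noticer)
coderack.run_one(slipnet, workspace)
slipnet.decay()
print(workspace)  # ['bond: a-a']
```

In the real program many codelets compete in the coderack at once, which is what produces the "parallel terraced scan" discussed below.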

Taking all these considerations into account, a good method for applying Hofstadter's system to human interaction could be to follow these sequential steps:

1 - Define your alphabet, that is, the set of symbols you think appropriate for your purpose. For example, you could choose as symbols all letters from $A$ to $K$, all digits from $0$ to $7$, all Greek letters, and other symbols (e.g. those from Hofstadter's propositional calculus, such as $\sim$ or $'$, or any others you believe appropriate);

2 - Define your syntax, that is to say, a set of "formation rules" or limitations on what is and is not allowed in a string (e.g., rules may be "any letter must be followed by an even digit", "two consecutive equal symbols are not allowed", and so on);

3 - Moving outside your formal system, attribute a meaningful one-to-one correspondence (i.e. an interpretation) to each element of the system. In this specific case, we could interpret symbols as topics of conversation: for example, we could assign the symbol $A$ the topic of weather, $B$ that of politics, $C$ that of taxes, $\alpha$ that of jokes, and so on. In this phase, you could also create subsets grouping topics into categories on the basis of their similarities, assigning symbols accordingly to facilitate later analyses (e.g. using $\alpha'$ for political jokes, $\alpha''$ for professional humour, $\alpha'''$ for toilet humour, and so on);

4 - Decide your "production rules": a set of rules that define/predict how strings are progressively built. These rules have to be defined by taking into account the interpretations of each symbol/string (note the difference from the formation rules above, which only state whether a string is allowed or not). Applying these rules should directly yield a formalization of human interaction. This is the last step and surely the most difficult one: the complexity of the rules needed to describe human interactions is far greater than that of, for example, the famous three-symbol "MIU" system used by Hofstadter to illustrate production rules. For this step, I would suggest you take into account the above-mentioned concepts of "analogy-making" and the "parallel terraced scan". The first can be defined (in Hofstadter's words) as "the perception of two or more non-identical objects or situations as being the `same' at some abstract level". This is clearly a pivotal factor in determining the sequence of topics within a human interaction. The second refers to another key aspect of human thoughts, interactions, and conversations: the different ways of asking a question, providing an answer or a comment, or changing the topic during an interaction are explored "in parallel" by the human mind in order to select the most appropriate one. In other words, for each possibility, the mind allocates resources in real time according to some feedback about its current promise (whose estimate is updated continually as new information is obtained); these analyses are carried out simultaneously, and the mind finally makes its choice and determines the path of the interaction.

Extending these concepts to the formalization of a possible sequence of topics within a human interaction would therefore require defining a precise set of rules that takes into account a number of issues, including similarities between topics, the appropriateness and "convenience" of topic changes, individual experience, randomness (another important element of human cognition), and so on. A good way to achieve the last step of our formalization could be to identify a set of functions that predict the probability of each possible topic start, topic change, and discussion stop, taking those issues into account. Since probabilities clearly vary among individuals, these functions should be determined starting from the characteristics of the two subjects involved in the interaction.

In this regard, I would suggest you start by drawing a model similar to those typically used in structural equation modeling (SEM). This is a statistical technique often used in psychological studies, in which the relationships between a set of measurable variables (called "manifest" variables) and a set of non-measurable variables (called "latent" variables) are explored by taking into account all possible interrelations between factors (it can be seen as a very general regression analysis). This multiple relationship structure is graphed using boxes, arrows, double arrows, and so on, and the strength of each association is calculated using specific parameters. In our case, the manifest variables are the characteristics of the two individuals (age, gender, cultural level, socioeconomic class, etc.), whereas the latent variables are the probabilities of starting, changing, or stopping any given topic during the interaction. If you have quantitative observational data (i.e., a sufficient number of observations for which the characteristics of the individuals and the topic path are known), you could directly infer a SEM model that includes quantitative parameters.
Alternatively, if observational data are not available, you might draw up a hypothesis of your model (as typically done in the initial phases of confirmatory analysis) and then try to estimate the strength of the relationships on the basis of your experience and previous studies (in this second case, the estimates should subsequently be validated in a future observational study). In both cases, the resulting model can be used to generate a set of functions that express the probability of starting, changing, and stopping each topic, given the characteristics of the two individuals involved. As stated above, this function network should also include some component of randomness (there are a number of studies on unpredictability and indeterminism in human behaviour and interactions). Once these functions are defined, applying them to our predefined symbols, syntax, and interpretations can directly provide a formalization of that specific human interaction in the Hofstadter style. Also note that, by increasing the complexity of these functions and appropriately choosing symbols, syntax, and interpretations, you could include additional features of the interaction in the formal description of the sequence of topics (for example, the duration of each topic, which individual starts/changes/stops each topic, the sequence of verbal interventions, the length of the conversation, and so on).
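To make the four steps concrete, here is a minimal sketch in Python. Every symbol, formation rule, and transition probability below is invented for the example; the hand-written transition table merely stands in for the SEM-derived probability functions discussed above:

```python
import random

# Toy run-through of steps 1-4 for the conversation-topics example.

# Step 1 (alphabet) and step 3 (interpretation):
INTERPRETATION = {"A": "weather", "B": "politics", "C": "taxes",
                  "T": "toilet humour"}
ALPHABET = set(INTERPRETATION)

# Step 2 (syntax): e.g. "two consecutive equal symbols are not allowed".
def well_formed(s):
    return set(s) <= ALPHABET and all(x != y for x, y in zip(s, s[1:]))

# Step 4 (production rules): probabilistic topic transitions.
# Each row must sum to 1; self-transitions are absent, so every
# produced string automatically satisfies the formation rule.
TRANSITIONS = {
    "A": {"B": 0.5, "C": 0.3, "T": 0.2},
    "B": {"A": 0.2, "C": 0.5, "T": 0.3},
    "C": {"A": 0.2, "B": 0.4, "T": 0.4},
    "T": {"A": 0.3, "B": 0.3, "C": 0.4},
}

def produce(start="A", length=6):
    symbols = [start]
    while len(symbols) < length:
        nxt = TRANSITIONS[symbols[-1]]
        symbols.append(random.choices(list(nxt),
                                      weights=list(nxt.values()))[0])
    return "".join(symbols)

random.seed(1)
conversation = produce()
print(conversation, "->", [INTERPRETATION[s] for s in conversation])
print(well_formed(conversation))  # True: no self-transitions are possible
```

Running `produce()` many times would let you study which topic sequences the rules tend to generate, which is where the probability functions derived from the SEM model would plug in.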


Here is an idea: try to produce a grammar whose production rules have a probability distribution. You might represent a conversation between two people as a string of topics discussed.

For example, the string "ABCBTS" might mean the conversation went from introductions, to questions about work, to the finer points of metaphysics, to more questions about work, and finally to toilet humour and sex. Related topics have a higher probability of branching back and forth, while unrelated topics may have a very low probability of suddenly springing up in conversation. You could then run a simulation of your grammar and collect statistics about where your strings invariably lead.
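A minimal Monte Carlo sketch of that idea. The topics and probabilities are invented, and "T" is made absorbing to caricature the observation in the question:

```python
import random
from collections import Counter

# Probabilistic grammar over topics: A = introductions, B = work,
# C = metaphysics, T = toilet humour / sex.  Each production rule
# carries a made-up probability; related topics branch back and forth.
RULES = {
    "A": [("B", 0.7), ("C", 0.2), ("T", 0.1)],
    "B": [("B", 0.3), ("C", 0.3), ("T", 0.4)],
    "C": [("B", 0.4), ("C", 0.2), ("T", 0.4)],
    "T": [("T", 1.0)],  # absorbing: once there, the conversation stays
}

def derive(start="A", steps=10):
    s = start
    out = [s]
    for _ in range(steps):
        nexts, weights = zip(*RULES[s])
        s = random.choices(nexts, weights=weights)[0]
        out.append(s)
    return "".join(out)

# Collect statistics over many simulated conversations.
random.seed(42)
endings = Counter(derive()[-1] for _ in range(10_000))
print(endings.most_common(1)[0][0])  # "T"
```

With these made-up rules, virtually every simulated conversation ends in "T", which is exactly the kind of "where do the strings invariably lead" statistic the simulation is meant to surface.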


I think Hofstadter's book primarily deals with formal systems AND the ways they can be broken: record players and the records that break them. Human beings have a special status in this setting, as they are both the greatest record and the greatest record player.

To me, what Hofstadter points his finger at is the fact that foundations (whatever formalism you may choose) are not sufficient. As a programmer, when I read about progress in physics towards a pancomputational theory of the universe, I'm always disappointed that efforts are targeted at finding a mathematical basis for the very bricks of the universe (subatomic particles, strings, you name it). Yes, it might allow us to do truly complex, amazing things. But would it make simple things easy? I doubt it. As a programmer I need a way to manipulate complex concepts as bricks, and sometimes I even need to see basic bricks as virtually complex compounds. If I need to get the x of y, is it x.y, or x.getY(), or getFromDB("SELECT y from x where x =.."), etc.? Semantically, it's all the same, but in terms of implementation, I have to take care of all these details that are quite irrelevant to the domain I'm trying to model but that have to exist because that's how you implement things in a computer. If I'm building a cat-renting website, chances are I'll have to handle quasi-metaphysical dilemmas in my code (state, time, concurrency, that kind of thing), even though these concepts never crossed my mind when I first thought up this awesome business idea.

I'll stress two points in Hofstadter's vision:

  • God: there are two ways to look at it: as a destination (a place to stand on) or as a direction. It is the latter vision that is supported in the book. God is what sits at the top of the infinite recursion chain. Now, if you allow me to blow my own trumpet along this metaphor, I'd say that when we're looking for foundations, ultimate theories, the end of science and so on, we're actually in the first mindset. What we get is a record player, and we call it god until we find the record that breaks it. What I think we need is a ladder to climb the recursion chain, a ladder that would allow us to go up and down and integrate the various "foundations" we find on the way.
  • Fractals: there is a picture of a fractal at the beginning of the book, and it's a shame Hofstadter doesn't detail this idea further. It makes me think of Deleuze's "What is Philosophy?", in which the philosopher states that the "plane of immanence is fractal" without explaining it either. Maybe because it's not an idea but a casual observation of the inner workings of the mind.

At last, we've arrived at the mountain lake I wanted you to see: "Fractal Patterns in Reasoning", Atkinson, David and Peijnenburg, Jeanne (2011). From the abstract:

This paper is the third and final one in a sequence of three. All three papers emphasize that a proposition can be justified by an infinite regress, on condition that epistemic justification is interpreted probabilistically. The first two papers showed this for one-dimensional chains and for one-dimensional loops of propositions, each proposition being justified probabilistically by its precursor. In the present paper we consider the more complicated case of two-dimensional nets, where each ‘child’ proposition is probabilistically justified by two ‘parent’ propositions. Surprisingly, it turns out that probabilistic justification in two dimensions takes on the form of Mandelbrot’s iteration. Like so many patterns in nature, probabilistic reasoning might in the end be fractal in character.


I'm a little late to the party, but I work on this stuff and thought I could provide a useful answer for anyone else with the same question. My own work is on modelling reality-oriented mental simulation, but a simpler example comes from modelling knowledge, so I'll use that.

In case you don't have time to read through all this, here are the summaries I give at the end of each section:

  • 0. A Quick Note. We usually formalise what concepts are about rather than the concepts themselves (e.g., knowledge, not our concept of knowledge).
  • 1. Philosophy: Delineating The Target. The first step is to figure out what we want to model and what it is like.
  • 2. Formalisation: Delineating The Project. The second step is to figure out what kind of formalisation we want (just a formal model, or a proof-system, too?).
  • 3. Building A Logic. The third step is to logically model (i.e., to formalise) the thing we want to model (and to build a proof-system if that's what we want), and then to check that it does what it's supposed to do.

0. A Quick Note

A quick note on the wording of your question. A common misconception is that philosophy (and perhaps other disciplines, too) deals with concepts. Sometimes it does (like any discipline). But most often it deals with the things that the concepts are about rather than with the concepts themselves. For example, just like (some) physicists study gravity rather than our concept of gravity, so too do epistemologists (philosophers of knowledge) study knowledge itself rather than our concept of knowledge. So, whilst it is sometimes concepts that are formalised, more often the formalisation is of the thing the concept is about rather than of our concept of it. A minor point, but worth noting (this topic is discussed at length in Williamson (2007) The Philosophy of Philosophy).

tl;dr: we usually formalise what concepts are about rather than the concepts themselves (e.g., knowledge, not our concept of knowledge).

1. Philosophy: Delineating The Target

We start with epistemology (the philosophy of knowledge), and work out what knowledge is like. This provides us with the thing that we want to formalise (making the analogy to sculpture, this philosophical work gives us a clear picture of the person the sculpture is of, and the logical work is like that of the artist trying to produce the sculpture (or model) of this person). Epistemology is an ongoing project, of course, and we don't yet have a correct conceptual analysis (something like: x is knowledge if and only if x is a justified true belief), but we have worked out quite a few principles, including:

  • ғᴀᴄᴛɪᴠɪᴛʏ. If an agent a knows p, then p (in other words, one can only know something if it is true - if something is false, then, whilst one might believe it, one cannot know it).

In formalising (i.e., modelling) knowledge, we need to encapsulate this epistemic principle. In fully and correctly formalising knowledge, we need to encapsulate all of the correct epistemic principles. And in fully and correctly formalising x (whatever x is), we need to encapsulate all of the correct principles of x.

tl;dr: the first step is to figure out what we want to model and what it is like.

2. Formalisation: Delineating The Project

Two different things might be meant by a 'formalisation of knowledge' (or, indeed, by a formalisation of anything): (i) a formal model of knowledge and (ii) a proof-system for knowledge.

(i) Formal Models. Developing a formal model does not involve developing a proof-system (i.e., a logical system in which one thing can be proven to follow from another, like propositional logic). A good example of this is the formalisation of questions and answers (see Belnap and Steel (1976) The Logic of Questions and Answers). As they note, there isn't much sense to developing a proof system in which one question can be proven to follow from another (for unimportant technical reasons this isn't to do with whether an answer follows from a question). But there is a lot of sense to developing a formal model of questions and answers (and, indeed, of knowledge and many other things, too). Three quick reasons why.

First, it forces us to be extremely precise in our characterisations (which often brings up things that weren't previously noticed).

Second, it allows us to offer an explanation of how a thing works in a mathematically tractable way.

Third, it provides an opportunity for reflective equilibrium; that is, a 'back-and-forth' between the logic and the philosophy, each providing a check, and thus an improvement, on the other through iterative developments of one on the basis of developments in the other. More concretely, the formalisation is built to encapsulate the philosophical principles, which emerge in the formalisation as theorems. The formalisation is good if the formal theorems match up with the philosophical principles. But oftentimes unexpected theorems turn up, and we have to interpret these theorems as philosophical principles and then decide whether we have found a new philosophical principle, have modelled an extant philosophical principle by accident, or have made a mistake. A good example of this reflective equilibrium at play is the formal work on counterfactual conditionals and the principle of conditional excluded middle (see Stalnaker (1980) 'A Defense of Conditional Excluded Middle').

(ii) Proof-Systems. The second sense in which one might formalise something is in taking a formal model of it (i.e., using (i)) and providing a way of proving whether one thing involving that model follows from another. For example, if we do (i) and come up with a formal model of a proposition, then, since one proposition can be a logical consequence (i.e., follow from) other propositions, we might also want to model that, too.

Here's an example. Suppose that Bob goes to the party, and that if Bob goes to the party then he'll dance. It follows that Bob dances, which is another way of saying that the proposition ❮Bob is dancing❯ is a logical consequence of the propositions ❮Bob is at the party❯ and ❮if Bob is at the party, then Bob is dancing❯. We can build a proof-system for this (though we don't need to, for one has already been built: propositional logic). I'll leave the details for below, but the rough idea is that we need a syntax (which gives us a language) and a semantics (which gives us meaning). For the syntax, say we have developed a language for the system in which 'P' and 'D' are atomic formulae and 'PD' is a compound formula (it doesn't matter what the difference is for now). (Note that, at this stage, the formulae are meaningless within the system, for we haven't defined what they mean in it. But it is useful for us to have in mind that 'P' models ❮Bob is at the party❯, 'D' models ❮Bob is dancing❯, and 'PD' models ❮if Bob is at the party, then Bob is dancing❯.)

Now we need to give these formulae meaning, which is what the semantics does. We define an interpretation of the language as a function from the atomic formulae of the language to either '1' or '0', where '1' models truth and '0' models falsity (though for ease of expression I'll just refer to them as 'truth-values' rather than 'models of truth-values'). (Just in case, a function is a set of tuples (constrained in ways that we don't need to worry about here), in this case a set of pairs ❮x,y❯, where x is an atomic formula and y is a truth-value.) Since there are two atomic formulae and two truth-values, there are four interpretations f of the language:

  • f1 = {❮P,1❯,❮D,1❯}
  • f2 = {❮P,1❯,❮D,0❯}
  • f3 = {❮P,0❯,❮D,1❯}
  • f4 = {❮P,0❯,❮D,0❯}

That's the extent to which atomic formulae have meaning in the language: they're either true or false (on an interpretation).

We then define the meanings of the operators, in this case just '→', which together with the meanings of the atomic formulae give us the meanings of the complex formulae, in this case just 'PD'. We define '→' as a truth function from two formulae of the language, x and y, to a truth-value, such that f(xy)=1 if and only if either f(x)=0 or f(y)=1. For example, f1(P)=1 and f1(D)=1, so either f1(P)=0 or f1(D)=1, so f1(PD)=1. On the other hand, f2(P)=1 and f2(D)=0, so it is not the case that either f2(P)=0 or f2(D)=1, so f2(PD)=0. We can build the definition of '→' into the function by constraining the function so that it works in exactly this way. (A quick note. Whatever '→' models (if anything), it isn't the English 'if'. But it's close enough, or rather, it'll do, for our purposes. If you're wondering what does model the English 'if', well, that's still the subject of lively debate. A good introduction to the topic is Bennett (2003) A Philosophical Guide To Conditionals.)

Finally, we define logical consequence as follows: a formula $B$ is a logical consequence of a set of formulae Σ if and only if there is no interpretation $f$ such that f(A)=1 (for every formula A in Σ) and f(B)=0.

That's all we need for our toy proof-system, for we can now prove that D is a logical consequence of P and PD. Proof. Suppose for reductio that, for some f, f(P)=1, f(PD)=1, and f(D)=0. Since f(PD)=1, either f(P)=0 or f(D)=1 [by the definition of '→']. So, since f(P)=1, it follows that f(D)=1. Contradiction [since we supposed that f(D)=0]. So, there is no interpretation f such that f(P)=1, f(PD)=1 and f(D)=0. So, D is a logical consequence of P and PD [by the definition of logical consequence].
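The toy semantics is small enough to check mechanically. Here is a sketch; the encoding of interpretations as dicts and of formulae as functions is my own, not part of the text:

```python
from itertools import product

# Enumerate the four interpretations of the atoms P and D, apply the
# truth-function for '->' from the text, and check logical consequence:
# B follows from a set of premises iff no interpretation makes every
# premise true and B false.

ATOMS = ["P", "D"]

def interpretations():
    for values in product([1, 0], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))  # e.g. {'P': 1, 'D': 0}

def arrow(x, y):
    # f(x -> y) = 1 iff f(x) = 0 or f(y) = 1
    return 1 if (x == 0 or y == 1) else 0

def is_consequence(premises, conclusion):
    # premises and conclusion map an interpretation to a truth-value
    return all(
        conclusion(f) == 1
        for f in interpretations()
        if all(p(f) == 1 for p in premises)
    )

P  = lambda f: f["P"]
D  = lambda f: f["D"]
PD = lambda f: arrow(f["P"], f["D"])

print(is_consequence([P, PD], D))  # True  (modus ponens holds)
print(is_consequence([D, PD], P))  # False (affirming the consequent)
```

The first check is exactly the reductio in the proof above, done by brute force over the four interpretations; the second shows a famous non-consequence failing for the same reason.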

So, there are different kinds of formalisations, and we need to decide what kind of model we want.

tl;dr: the second step is to figure out what kind of formalisation we want (just a formal model, or a proof-system, too?)

3. Building A Logic

Once we know what we're modelling and the kind of model we want, we can get around to formalisation. The tools needed depend on what's being modelled. The usual framework for modelling knowledge is a possible worlds framework. For present purposes, you can think of a possible world as a set of pairs <x,y>, one for each proposition x, where y is a truth-value. There are an infinite number of possible worlds. One possible world models the actual world, how things actually are. (For a good introduction to possible worlds and possible worlds semantics, see Menzel (2013) 'Possible Worlds', in The Stanford Encyclopedia of Philosophy.)

I'll skip actually building the logic as it isn't needed in full glorious detail for the purpose of answering this question (if you're interested in the different systems that have been built, see Priest (2008) An Introduction to Non-Classical Logic, second edition). But a few details are necessary for this to be illustrative. One model of knowledge is as follows:

  • a knows that P (at a world w) if and only if, for every world w', if wRw', then P is true at w'.

Here's how to interpret this. A world models a way things could be. We can take w to model the way things actually are (rather than how things might otherwise be), so we can take the above to be a model of a's knowing P in actuality. Since a possible world is, for our purposes, just a set of pairs of every proposition and a truth-value, that P is true at some world w'' just means that <P,1> is an element of w''; and if w'' models a non-actual world, then it means that P is possible, and w'' is the possibility (or rather, one of the possibilities) in which P is true.

The relation R models a's evidence. Suppose that P is an epistemic possibility given a's evidence (that is, judging from the evidence, P could be either true or false). Since a is at world w, this will be modelled by the relation R being such that, for some world w1 at which P is true and some world w2 at which P is false, wRw1 and wRw2. As such, the model determines that a does not know P. Next, suppose that P is an epistemic impossibility given a's evidence (that is, judging from the evidence, P is false). This will be modelled by the relation R being such that, for no world w1 at which P is true, wRw1. As such, the model determines that a does not know P (and that a knows not-P). Finally, suppose that P is an epistemic necessity given a's evidence (that is, judging from the evidence, P is true). This will be modelled by the relation R being such that, for every world w1 with wRw1, P is true at w1. As such, the model determines that a knows P (and does not know not-P).
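Here is a minimal sketch of that model, with worlds encoded as dicts and R as a set of pairs. This encoding is my own toy illustration, not a full modal-logic implementation:

```python
# Toy possible-worlds model of knowledge: a knows P at w iff P is true
# at every world w' with (w, w') in R.  R models the agent's evidence.

# Each world assigns a truth-value to each proposition.
WORLDS = {
    "w0": {"P": True},   # the actual world
    "w1": {"P": True},
    "w2": {"P": False},
}

def knows(R, w, prop):
    """a knows prop at w iff prop is true at every R-accessible world."""
    return all(WORLDS[v][prop] for (u, v) in R if u == w)

# P is an epistemic possibility: the evidence leaves both a P-world and
# a not-P-world accessible from w0, so a does not know P.
R_open = {("w0", "w1"), ("w0", "w2")}
print(knows(R_open, "w0", "P"))     # False

# P is an epistemic necessity: only P-worlds are accessible from w0.
R_settled = {("w0", "w1")}
print(knows(R_settled, "w0", "P"))  # True
```

Changing only the relation R changes what the agent counts as knowing, which is exactly the role the text assigns to evidence.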

As a final point, we can see how the theorems of this system will (or won't, as the case may be) encapsulate the philosophical principle ғᴀᴄᴛɪᴠɪᴛʏ mentioned earlier.

  • ғᴀᴄᴛɪᴠɪᴛʏ. If an agent a knows p, then p (in other words, one can only know something if it is true - if something is false, then, whilst one might believe it, one cannot know it).

Whether or not ғᴀᴄᴛɪᴠɪᴛʏ is modelled by a theorem depends on how we define R. Different systems put different constraints on it, so in some of them, but not others, we get the theorem

  • ғᴀᴄᴛɪᴠɪᴛʏ*. For all worlds, agents, and propositions: if a knows that P (at a world w), then wRw and P is true at w.

In our system, there is no constraint requiring that if a knows P (at w) then P is true at w, and so no theorem ғᴀᴄᴛɪᴠɪᴛʏ*. So, either the principle ғᴀᴄᴛɪᴠɪᴛʏ is wrong, or our system is (it's the latter).
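The failure of ғᴀᴄᴛɪᴠɪᴛʏ* can be made concrete with the same toy encoding of worlds and R (again my own illustration): without the constraint wRw, an agent can "know" something that is false at their own world:

```python
# Factivity fails without reflexivity: if w is not R-accessible from
# itself, 'a knows P at w' can hold even though P is false at w.

WORLDS = {"w": {"P": False}, "w1": {"P": True}}

def knows(R, w, prop):
    # a knows prop at w iff prop is true at every R-accessible world
    return all(WORLDS[v][prop] for (u, v) in R if u == w)

R = {("w", "w1")}               # not reflexive: ("w", "w") is missing
print(knows(R, "w", "P"))       # True, yet P is false at w itself

R_reflexive = R | {("w", "w")}  # imposing wRw restores factivity
print(knows(R_reflexive, "w", "P"))  # False
```

Adding the reflexivity constraint on R is precisely what the stronger systems mentioned above do to make ғᴀᴄᴛɪᴠɪᴛʏ* come out as a theorem.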

There is, of course, a whole bunch more involved here, and this is more than a little over-simplified. But it should at least give a flavour of the sense in which something can be formalised.

tl;dr: the third step is to logically model (i.e., to formalise) the thing we want to model (and to build a proof-system if that's what we want), and then to check that it does what it's supposed to do.