When is a proof or definition formal?

4.4k Views Asked by At

When is a proof or definition formal?

I've been searching for an explanation of when or proof or definition is formal.

Sometimes, authors call their proofs or definitions informal without further explanation.

I've an idea that a formal proof is derived directly from the axioms (with references to already proven statements), or are the requirements less strict?

When is a definition formal? May words whose meaning follow only from the context be used in a formal definition, or should each use of a word be explicitly defined?

8

There are 8 best solutions below

1
On

It's not a very clear-cut line. There's also a complication, because mathematicians often use certain styles of language in a formal proof, making the proof sound more formal-in-the-colloquial-English-sense. They do this to signal "this is a formal-in-your-sense proof". The converse is also true.

Here's an informal intensional definition:

A formal proof is basically one where nothing is hidden.

(The definition of "nothing" is flexible, depending on the level of the proof. A formal proof aimed at world-class mathematicians may miss out more details than a first-year undergraduate formal proof, which is not allowed to miss out many details at all.)

One may use words whose meaning follows only from context, as long as it's clear what the meaning is; that's not hiding anything. A proof can skip out details and still be formal, as long as it tells you how to complete the details: that's hiding stuff but making it clear that the stuff is a) hidden, and b) easy to recover.

Here's an informal extensional definition.

A proof which leaves large chunks to the reader is informal: it hides large amounts of detail. A definition which misses out some annoying special cases (perhaps to make the statement more slick) is informal. A proof which pedantically goes over every statement is formal. A proof in which every statement clearly follows from earlier statements or from axioms/hypotheses is formal.

0
On

If you make a statement in mathematics, the question is, can you put the statement exactly in the mathematical framework, ie can you convey your thoughts mathematically so as not to lose any details. As an example say we take the limit definition. I make a statement as follows. "The value of a real valued function $f$ gets closer and closer to $L$ as $x$ gets closer and closer to $a$." Now, can we write this "formally" in mathematical languaage ? Of course we can. $\forall \: \epsilon>0, \exists\:\delta>0 \ni 0<|x-a|<\delta\implies|f(x)-L|<\epsilon$. More precisely $\lim_{x\to a} f(x)=L$. So we have translated an informal statement into a precise mathematical definition.

In case of proofs as well the situation is similar. The idea is to capture mathematically every possible details that are necessary in proving the result.

4
On

Personally, when I think of a formal definition I think of a concept explained with mathematical language instead of natural language.

For example, when we define the limit of a function $f$ at a point $a$ we can define it with both of these forms:

Definition 1: We call $l$ the limit of a function $f: R \rightarrow R$, if $$(\forall \epsilon \in R^+)( \exists \delta \in R^+)( 0<|x-a|<\delta \implies |f(x) - l| < \epsilon)$$

Definition 2: We call $l$ the limit of a function $f$ at $a$ if it is the number that the function tends to near $f(a)$ (if it exists).

Definition 1 is a formal definition, with mathematical language and citing all the necessary conditions in order to define the concept. Definition 2 lets the user imagine intuitively the concept but don't let the user work with the concept because he doesn't have the mathematical concept in order to deal with proofs.

Note that in both definitions we are defining the same concept, but the first one allows us to work with the definition avoiding the ambiguity - nevertheless, it is very easy to understand the concept in the second definition.

1
On

The problem here is that the answer depends on whether "formal" is formally or informally defined:)

If it is formally defined it would probably be defined as a mathematical term and the requirement of the proofs and definition would be quite strict. They would probably need to be formulated in a formal language adhering to certain rules. Normally one does not see these kind of definitions and proofs.

Otherwise the term "formal" would have a more relaxed meaning and one would not require a formal definition or proof to adhere to some very strictly defined standards. In this case one would mean that formal means more strict formulation than informal (which is more sloppy formulation) - where you draw the line is depending on context.

Also which one is used can also be seen in context. If you don't see any formal language (note however that a formal language can look natural) or formal definition of "formal" it would most probably be used in the second meaning.

6
On

A formal proof is one where you don't attach an interpretation to what you write, and all steps are "mechanically" justified by explicit pre-established rules, be them axioms or other theorems.

For instance, when writing

$$a\ne0,ax^2+bx+c=0\\\iff\\ x^2+\frac bax+\frac ca=0\\\iff\\ x^2+2\frac b{2a}x+\frac{b^2}{4a^2}=\frac{b^2}{4a^2}-\frac ca\\\iff\\ \left(x+\frac ba\right)^2=\frac{b^2}{4a^2}-\frac ca,$$

the first step is justified by the rule of division of an equality and by distributivity/associativity, the second by addition of a term to an equality and the third by a remarkable identity.

At no time do you have to think that the symbols represent numbers and the operators arithmetic computations. You apply the available rules and transform the expressions. So you are looking more at the patterns than at the content. The goal is to reduce human error.


This doesn't mean that the proof must be written in a symbolic form rather than as a narrative explanation. What matters is that all steps are clearly justified by a previously established truth. In practice some obvious justifications can be omitted, provided the intented readers can retrieve them.

5
On

There is actually a whole spectrum of "formality" in mathematics. In informal terms, "formal" it refers to what is considered as rigorous, but that is of course subjective.

  • Absolutely formal: Written in a language that can be verified by a program that implements some formal system. Check out Mizar and Coq, two well-known proof assistants based on two different foundational systems for mathematics.

  • Extremely formal: Written in a manner such that any logician is capable of translating it by hand into an absolutely formal version. For example, "$\forall x \in S\ ( P(x) )$" where "$P$" is a predicate can be written as a short-form for "$\forall x\ ( x \in S \to P(x) )$", where the latter is permitted in pure first-order set theory.

  • Obviously formal: Written in a manner such that all logicians will agree can be translated mechanically into an absolutely formal version. It may well be impractically troublesome to actually do it, but nobody doubts that it can be done. For example, consideration of symmetry such as in "Without loss of generality $a < b < c$, ..." can be done whenever all the previously deduced statements are symmetric under permutation of $a,b,c$ and $a,b,c$ come from a total order. The easiest way to absolutely formalize it would be to simply repeat the subsequent argument for each permutation, but it is prohibitive especially if there are a lot of symmetry considerations!

  • Provably formal: Written in a manner such that any logician can prove that it can be translated mechanically into an absolutely formal version. This is where it starts to get hairy, because one logician might use some meta-system MS to prove this, but another skeptical logician might reject some inference rule or axiom of MS that the first logician used. So the first logician might insist that an absolutely formal proof exists, but the second might insist that it does not exist. For example, ZF set theory proves that the Hydra game terminates. In particular the following play terminates: At step $n$, if the tree is just a single node then the game ends, otherwise we remove the right-most leaf node and then if its parent is not the root we duplicate the resulting parent subtree $n$ times. This holds even if the initial tree has depth $100$, and ZF additionally can prove that a weaker formal system called PA can prove it. But the shortest proof of this in PA will have more deductive steps than the number of particles in the observable universe! So a logician who rejects ZF but accepts PA might not believe that an absolutely formal proof exists...

  • Reasonably formal: Written in the standard of modern mathematics, that is, in sufficient detail to convince experts in that field of mathematics that it is rigorous and correct. Lots of steps may be skipped, and yet it may be considered perfect. For example, one may define the diameter of a finite graph with weighted edges to be the largest distance between two nodes in the graph, where distance between $x,y$ is defined as the minimum possible sum of weights along a path from $x$ to $y$. It is clear to any graph theorist that this is well-defined, even though technically one has to prove the implicit claims of existence of "minimum" and "maximum" here; it is not totally trivial!

  • Somewhat Informal: Written (or more often spoken) to fellow mathematicians in a manner that is not even reasonably formal, especially when diagrams are involved. You will find less diagrams in more formal proofs, simply because nearly all diagrams require some amount of intuitive interpretation by the intended audience to determine what the diagram is supposed to illustrate, and diagrams almost always depict a single instance and not the general case. Good diagrams will depict a non-trivial instance while bad diagrams may actually mislead viewers to make false assumptions! Also, informal mathematics often arises together with phrases like "it can be shown" and "this will be left as an exercise for the reader"...

  • Very informal: Done with vague words like "near" and "close" in analysis. This is sometimes sufficient to convince a fellow expert in the same field, but it is usually insufficient to enable mathematicians not in the same field to actually write down a formal proof without much effort. Worse still, informal proofs by non-mathematicians tend to be wrong, but often the flaws are not seen by other non-mathematicians.

4
On

There is one aspect of "definitions" that is not exactly asked in your question, but is relevant to mathematics. There are actually two kinds of completely formal definitions, arising from two separate mechanisms: $ \def\nn{\mathbb{N}} $

  1. Definition by existential instantiation: When we have an axiom or a deduced sentence asserting the existence of some object, namely "$\exists x\ ( P(x) )$" where $P$ is some predicate, we can instantiate it and name a reference to it, by saying "Let $c$ be such that $P(c)$.". This is far more general than you might think. For example when you define a function $f : \nn \to \nn$ such that $f(0) = 0$ and $f(n+1) = 2f(n)+1$ for every $n \in \nn$, what you are actually doing is to invoke a theorem that says "$\exists f : \nn \to \nn\ ( f(0) = 0 \land \forall n\in\nn\ ( f(n+1) = 2f(n)+1 ) )$". Don't laugh! This is not necessarily a trivial theorem, depending on your choice of foundational system for mathematics. The bottom line is, such definitions are only valid if you can prove the existential statement that guarantees the existence of the object that you are giving a name to.

  2. Definitorial expansion: In some foundational systems, we are unable to use the above type of definition, simply because we cannot prove the required existential statement. For example in ZF set theory one can actually prove that there is no set $S$ whose members are exactly the sets that are not members of themselves. Namely, $\neg \exists S\ \forall x\ ( x \in S\ \Leftrightarrow \neg x \in x )$. But we still can define a predicate $R$ where $R(S) \overset{def}\equiv \forall x\ ( x \in S\ \Leftrightarrow \neg x \in x )$ for every set $S$. $R$ is not a set, but we can still talk about sets that satisfy $R$. For example, $R(\{\})$ because the empty-set is not a member of itself. This notion of definition can itself be formally defined as described in this post, and more details about the issue with Russel's paradox can be found in this post. This is also more common than you might think. For example in ZFC set theory, there is no function that maps every set to its cardinality, and so the cardinality notation in "$\#(S)$" or "$|S|$" cannot be defined using the first type of definition. But it can be defined via this type of definition. Similarly, when you define the diameter $diam(G)$ of every finite graph $G$, note that actually $diam$ is not a function in ZFC set theory, because there is no set of all finite graphs! But the notion of "finite graph" and $diam$ are both definable via definitorial expansion, and we can use $diam$ on any finite graph. Things get a bit messy with higher-order things like the family of functions on finite graphs. That cannot even be defined via definitorial expansion, but we can cheat in many cases. For example we can define even by mere existential instantiation the family of all functions on graphs whose vertices are a subset of $\nn$.

0
On

An formal statement/proof assumes that the problem structure is "sufficiently regular" for the conclusion to hold, and merely explains how to derive the conclusion assuming the unstated assumptions hold true.

Example of an informal statement:

The sum of an infinite geometric series with ratio $r$ is $$\sum_{k=0}^\infty r^k = \frac{1}{1-r}$$

This statement is informal since it is unclear whether this definition applies to some $r$ or all $r$; it does not explain whether it is intended to ascribe a sum to divergent series (sometimes possible).

A formal statement/proof is one that states all assumptions necessary for its conclusions to hold, except perhaps those that are blatantly obvious to the average audience.

Example of a formal statement:

The sum of an infinite geometric series with ratio $r$ is $$\sum_{k=0}^\infty r^k = \frac{1}{1-r}$$ if $|r|<1$; otherwise, we say the series diverges and does not have a sum.

This statement is formal since it explains its required assumptions ($|r|<1$) and does not leave the reader guessing as to whether $1+2+4+\ldots$ was intended to have a sum.

Nevertheless, neither statement is considered incorrect; both of them are correct with reasonable assumptions.