Do definitions have to fit axioms in logic?


One thing I find confusing in propositional logic is that we have things like axioms and inference rules but then we seem to be able to define whatever we want in syntax that doesn't necessarily adhere to the axiom formats.

For example

https://en.wikipedia.org/wiki/Propositional_calculus#Example_1._Simple_axiom_system

This example system uses the modus ponens inference rule:

$P, P \to Q \vdash Q$

And the following axioms:

I. $(p \to (q \to p))$

II. $((p \to (q \to r)) \to ((p \to q) \to (p \to r)))$

III. $((\lnot p \to \lnot q) \to (q \to p))$

We have the $\lnot$ and $\to$ operators in the language, but then we define $a \land b = \lnot(a \to \lnot b)$ even though this format does not match any of the three axioms, nor have we defined equality.

Why is this permitted? What are we allowed to define? What are we even using modus ponens and the axioms for if we can just make up whatever?


There are 4 best solutions below

0
On BEST ANSWER

Your question arises from the failure of many texts to properly distinguish between the meta-system and the actual formal system under study. At all times you are doing mathematics in the meta-system, and in mathematical logic the object you are studying is some formal system (such as the one you have here, with some syntactic rules for forming well-formed formulae (wffs), one deductive rule, and three axioms). So let us express these precisely, and you will see. $ \def\quote#1{{``}#1{"}} \def\meta#1{\mathbin{\dot#1}} $

Syntactic rules

Note that wffs are strings. Given any two strings $x,y$ we shall use "$x+y$" to denote the concatenation of $x$ followed by $y$. We shall also use quotes to specify literal strings. For example, you are a person but "you" is a string.

Closure under negation: Given any wff $A$, the string $\quote\neg+A$ is also a wff.

Closure under implication: Given any wffs $A,B$, the string $\quote(+A+\quote\to+B+\quote)$ is also a wff.

Note how I used quote-marks above. It would be technically incorrect to write:

... the string $(A \to B)$ is also a wff. (technically incorrect)

Because the "$\to$" and the brackets are symbols in the formal system under study, not symbols in the meta-system we are using!
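As a rough illustration of the two closure rules, here is a minimal Python sketch in which wffs really are strings and the rules are plain concatenation. The function names `neg` and `imp`, and the choice of single letters such as `"p"` as atomic wffs, are assumptions made for the example and are not part of the system described above.

```python
# Wffs are ordinary Python strings; "+" is string concatenation,
# just like the "+" used in the meta-system above.
# Assumption: single lowercase letters such as "p" are atomic wffs.

def neg(a: str) -> str:
    """Closure under negation: the string '¬' + A."""
    return "¬" + a

def imp(a: str, b: str) -> str:
    """Closure under implication: the string '(' + A + '→' + B + ')'."""
    return "(" + a + "→" + b + ")"

p, q = "p", "q"
print(imp(p, imp(q, p)))     # (p→(q→p))
print(neg(imp(p, neg(q))))   # ¬(p→¬q)
```

Here `imp` is a function in the meta-system (Python), while the character `"→"` inside the quotes is a symbol of the formal system, which is exactly the distinction the quote-marks are meant to enforce.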

Deductive rules

The system under study has only one deductive rule:

Given any wffs $P,Q$, if you have deduced $P$ and $\quote(+P+\quote\to+Q+\quote)$, then you can deduce $Q$.

Again, note how I used quote-marks.
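To see what the deductive rule and the axioms are for, here is a small sketch that derives the string `(p→p)` from axiom schemes I and II of the question using only the rule above. The helper names (`imp`, `ax1`, `ax2`, `modus_ponens`) are hypothetical, and the code does not check that its inputs are wffs; it only checks that the rule is applied to strings already deduced.

```python
def imp(a, b):
    """The string '(' + A + '→' + B + ')'."""
    return "(" + a + "→" + b + ")"

def ax1(p, q):
    """Instance of axiom scheme I: (p→(q→p))."""
    return imp(p, imp(q, p))

def ax2(p, q, r):
    """Instance of axiom scheme II: ((p→(q→r))→((p→q)→(p→r)))."""
    return imp(imp(p, imp(q, r)), imp(imp(p, q), imp(p, r)))

def modus_ponens(deduced, p, q):
    """Return Q if both P and '(' + P + '→' + Q + ')' were already deduced."""
    if p in deduced and imp(p, q) in deduced:
        return q
    raise ValueError("modus ponens does not apply")

p = "p"
pp = imp(p, p)                     # the target string (p→p)
deduced = []
deduced.append(ax1(p, pp))         # 1. (p→((p→p)→p))
deduced.append(ax2(p, pp, p))      # 2. ((p→((p→p)→p))→((p→(p→p))→(p→p)))
deduced.append(modus_ponens(deduced, ax1(p, pp), imp(ax1(p, p), pp)))  # 3. from 1, 2
deduced.append(ax1(p, p))          # 4. (p→(p→p))
deduced.append(modus_ponens(deduced, ax1(p, p), pp))                   # 5. from 4, 3
print(deduced[-1])                 # (p→p)
```

Every line of the derivation is either an axiom instance or the result of the single rule, so nothing here is "made up"; an abbreviation such as "$\land$" never enters the derivation as a new assumption.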

Abbreviative definitions

Now we come to the so-called 'definition' of "$\land$":

Take any strings $A,B$. The string $\quote(+A+\quote\land+B+\quote)$ is not a wff in the formal system under study, simply because "$\land$" is not a symbol in its language. However, we wish to use that string to stand for $\quote{\neg(}+A+\quote{\to\neg}+B+\quote)$.

This wish is not trivial to fulfill rigorously. The easiest way to do it correctly is to add a syntactic rule for closure of wffs under $\quote\land$:

Closure under conjunction: Given any wffs $A,B$, the string $\quote(+A+\quote\land+B+\quote)$ is also a wff.

and then check that you can still uniquely parse (interpret) a wff, so that it makes sense to stipulate that $\quote(+A+\quote\land+B+\quote)$ is rewritten as $\quote{\neg(}+A+\quote{\to\neg}+B+\quote)$ before parsing, to obtain our wish.

As you observed, such a rewrite-rule is not an axiom.
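The rewrite-before-parsing idea can be sketched as a small recursive-descent parser that accepts the extended syntax and emits the corresponding wff of the original system. This is only one possible way to do it; the function names and the restriction to single lowercase letters as atomic wffs are assumptions made for the example.

```python
def expand(s: str) -> str:
    """Rewrite every '(A∧B)' in s as '¬(A→¬B)'; raise if s does not parse."""
    out, end = _parse(s, 0)
    if end != len(s):
        raise ValueError("trailing symbols after a complete wff")
    return out

def _parse(s, i):
    """Parse one extended wff starting at index i.

    Returns (rewritten wff, index just past it)."""
    if i >= len(s):
        raise ValueError("unexpected end of string")
    c = s[i]
    if c == "¬":                       # closure under negation
        a, j = _parse(s, i + 1)
        return "¬" + a, j
    if c == "(":                       # closure under → or ∧
        a, j = _parse(s, i + 1)
        if j >= len(s) or s[j] not in ("→", "∧"):
            raise ValueError("expected a binary connective")
        op = s[j]
        b, k = _parse(s, j + 1)
        if k >= len(s) or s[k] != ")":
            raise ValueError("expected ')'")
        if op == "→":
            return "(" + a + "→" + b + ")", k + 1
        return "¬(" + a + "→¬" + b + ")", k + 1   # the rewrite-rule for ∧
    if "a" <= c <= "z":                # assumed atomic wffs
        return c, i + 1
    raise ValueError("unexpected symbol " + repr(c))

print(expand("(p∧q)"))        # ¬(p→¬q)
print(expand("((p∧q)→p)"))    # (¬(p→¬q)→p)
```

At each position the parser has exactly one applicable case, which is a concrete form of the unique-parsing check: every accepted string is read in only one way, so the rewrite is well defined.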

What is that 'equality'?

Note that I did not say that $\quote(+A+\quote\land+B+\quote)$ is the same string as $\quote{\neg(}+A+\quote{\to\neg}+B+\quote)$, because that is of course false. We are only using a rewrite-rule; the strings themselves are not equal.

You are equally free to 'define' any other notation in the same fashion, using rewrite-rules, and you would have to deal with the same issue of unique parsing. This happens in mathematics itself as well. When you define a new notation it is important that there is still only one way to read things.

So while it is technically wrong to state this rewrite-rule as an equality, it is intuitively 'equal' in the sense of being logically equivalent, since the final parsing is the same.


I hope that this addresses your inquiry. If everything is clear, you can continue reading. There is a different way to go about logic that would actually make what is technically wrong above correct, but it may be confusing unless you fully understand the more concrete way above.

Meta-operators

First let us see how we can abstract out the wff formation:

Given any string $A$, define $\meta\neg A = \quote\neg+A$.

Given any strings $A,B$, define $A \meta\to B = \quote(+A+\quote\to+B+\quote)$.

Note that unlike the strings $\quote\neg$ and $\quote\to$, $\meta\neg$ and $\meta\to$ are operations on strings (in the meta-system). So we can in fact do the following:

Given any strings $A,B$, define $A \meta\land B = \meta\neg( A \meta\to (\meta\neg B) )$.

Note that the brackets here are in the meta-system, used so that we know which string operation to perform first. If we use the typical precedence rules, namely that $\meta\neg$ is higher precedence than $\meta\to$, then we could have done the following:

Given any strings $A,B$, define $A \meta\land B = \meta\neg( A \meta\to \meta\neg B )$.

A more abstract way to conceptualize this is that $\meta\to$ and $\meta\neg$ are actually operations on parse trees rather than strings, and so the above definition of $\meta\land$ is just a definition of a new operation on parse trees in terms of previously defined ones.
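The parse-tree view can be sketched as follows, with Python dataclasses standing in for tree nodes. The class names `Var`, `Not`, `Imp` and the functions `meta_and` and `to_string` are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str          # an atomic wff (assumed: any short name)

@dataclass(frozen=True)
class Not:
    arg: object        # the tree being negated

@dataclass(frozen=True)
class Imp:
    left: object       # antecedent tree
    right: object      # consequent tree

def meta_and(a, b):
    """A new operation on parse trees, defined from the existing ones."""
    return Not(Imp(a, Not(b)))

def to_string(t) -> str:
    """Encode a parse tree as a wff string of the formal system."""
    if isinstance(t, Var):
        return t.name
    if isinstance(t, Not):
        return "¬" + to_string(t.arg)
    if isinstance(t, Imp):
        return "(" + to_string(t.left) + "→" + to_string(t.right) + ")"
    raise TypeError("not a parse tree")

tree = meta_and(Var("a"), Var("b"))
print(tree == Not(Imp(Var("a"), Not(Var("b")))))   # True
print(to_string(tree))                             # ¬(a→¬b)
```

In this setting the equality in the definition of `meta_and` is a genuine equality between trees in the meta-system, and no symbol "∧" ever appears in the encoded string.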

The question that may arise at this point is: Why don't we do it this way and not use strings at all? The simple answer is that the only way to completely formalize a formal system is to be able to encode it into some linear representation such as strings, so you are still going to have to decide on how exactly to encode wffs as strings. The same holds when you write logic on paper. Hence the concrete first approach is ultimately the practical way.

31
On

You can define anything you want. However, the point of defining something is to make it easier to refer to, which means that the most useful definitions are for things that are:

(a) frequently referred to;

(b) not trivial; and often

(c) similar to something else

So, for example, we define $\wedge$ because it allows for a lot of shortcuts in writing the propositional logic, and it happens to align with the general understanding of the word "and". The "=" in the definition isn't really part of the logic, it's a part of the language surrounding it, and we know that there's a level at which we have to resort to shared understanding since you can only abstract things so far.

On the other hand, I probably wouldn't bother coming up with a definition for "the set of all even prime numbers in $\mathbb{N}$", because it's simple enough to just say $\{2\}$. Or if I did define it, it would only be for a very limited context (for example, one where I actually needed to prove that 2 is the only element in the set), so I could get away with a generic definition like $A$.

3
On

I'm going to guess that you're conflating two different notions, namely "well-formed" and "logically valid". (My guess is admittedly based on just one little piece of one of your comments, namely "valid / makes sense".)

Of those two notions, only "well-formed" is relevant to definitions. You can define new symbols to abbreviate any well-formed formula, for example $\neg(a\to\neg b)$. The well-formed formulas are the ones that "make sense", i.e., have a truth value once you specify truth values for the variables in them. For example (check this with a truth table if you haven't already done so), $\neg(a\to\neg b)$ is true if both $a$ and $b$ are true, but $\neg(a\to\neg b)$ is false in all other circumstances.
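The truth-table check suggested above can be done mechanically. The following is a small sketch; the helper name `implies` is just a convenience for the material conditional.

```python
from itertools import product

def implies(x: bool, y: bool) -> bool:
    """Truth function of the material conditional x → y."""
    return (not x) or y

rows = []
for a, b in product([True, False], repeat=2):
    value = not implies(a, not b)      # truth value of ¬(a→¬b)
    rows.append((a, b, value))
    print(a, b, value)

# ¬(a→¬b) agrees with "a and b" on every row.
print(all(value == (a and b) for a, b, value in rows))   # True
```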

Of the two notions, only "logically valid" is governed by the axioms and inference rules. The axioms are certain, selected, logically valid formulas, and the inference rules enable us to produce additional logically valid formulas from the axioms. We'll never produce $\neg(a\to\neg b)$ that way, because it's not logically valid. As indicated above, it's false sometimes (whenever at least one of $a$ and $b$ is false).

So $\neg(a\to\neg b)$ is not valid, but it is well-formed. In other words, it's not always true, but it always makes sense, it always has a truth value (when $a$ and $b$ have truth values). And the latter is what's needed for the defined expression $a\land b$ to make sense.
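The contrast between the two notions can also be checked by brute force: the three axioms from the question come out true on every row of their truth tables, while $\neg(a\to\neg b)$ does not. This is a sketch with hypothetical helper names (`implies`, `is_valid`), and it checks semantic validity only; it says nothing about derivations.

```python
from itertools import product

def implies(x: bool, y: bool) -> bool:
    """Truth function of the material conditional x → y."""
    return (not x) or y

def is_valid(formula, n: int) -> bool:
    """True if `formula` (a function of n booleans) is true on all 2**n rows."""
    return all(formula(*vals) for vals in product([True, False], repeat=n))

ax1 = lambda p, q: implies(p, implies(q, p))
ax2 = lambda p, q, r: implies(implies(p, implies(q, r)),
                              implies(implies(p, q), implies(p, r)))
ax3 = lambda p, q: implies(implies(not p, not q), implies(q, p))
conj = lambda a, b: not implies(a, not b)      # ¬(a→¬b)

print(is_valid(ax1, 2), is_valid(ax2, 3), is_valid(ax3, 2))   # True True True
print(is_valid(conj, 2))                                      # False
```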

0
On

The key issue here is soundness.

The purpose of definitions in propositional calculus lies in converting notions not using the primitive connectives into well-formed formulas using only the primitive connectives, and doing the converse. In other words, definitions exist to translate between connectives.

There's an alternative way of expressing definitions: work in a propositional calculus with functorial variables rather than one without them. Basically, it turns out that definitions which define connectives convert into tautologies with functorial variables of the form (in Polish notation)

C $\delta$x $\delta$y

where x is one side of the definition, and y is the other side. It also turns out that if C $\delta$x $\delta$y, then Exy, and if Exy, then C $\delta$x $\delta$y. Correspondingly, every definition has the property that one side is logically equivalent to the other side. Thus, for any definition of a connective, when the side being defined is written in Polish notation, the connective should appear as the first symbol of that well-formed formula and appear only once in it. If some other formula is logically equivalent to it, then one can reasonably define that connective by that other formula.

For example, one common definition (again in Polish notation) is:

Apq := CNpq

which defines logical disjunction in terms of implication and negation. But, since

E Apq CCpqq

is also a tautology, and A appears as the first symbol in 'Apq' and only appears once in 'Apq', one could use 'CCpqq' to define 'A' instead of using 'CNpq'.
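Both equivalences can be checked with a short evaluator for Polish notation. This is a minimal sketch that assumes the usual readings of the letters (N negation, C implication, A disjunction, E equivalence) and single lowercase letters as variables; the function names are hypothetical.

```python
from itertools import product

# Truth functions for the binary connectives in Polish notation.
BINARY = {
    "C": lambda x, y: (not x) or y,   # implication
    "A": lambda x, y: x or y,         # disjunction
    "E": lambda x, y: x == y,         # equivalence
}

def _eval(tokens, i, env):
    """Evaluate one formula starting at tokens[i]; return (value, next index)."""
    t = tokens[i]
    if t == "N":
        v, j = _eval(tokens, i + 1, env)
        return (not v), j
    if t in BINARY:
        x, j = _eval(tokens, i + 1, env)
        y, k = _eval(tokens, j, env)
        return BINARY[t](x, y), k
    return env[t], i + 1               # a variable

def evaluate(formula: str, env: dict) -> bool:
    tokens = [c for c in formula if not c.isspace()]
    value, end = _eval(tokens, 0, env)
    if end != len(tokens):
        raise ValueError("trailing symbols after a complete formula")
    return value

def is_tautology(formula: str) -> bool:
    names = sorted({c for c in formula if c.islower()})
    return all(evaluate(formula, dict(zip(names, vals)))
               for vals in product([True, False], repeat=len(names)))

print(is_tautology("E Apq CNpq"))    # True
print(is_tautology("E Apq CCpqq"))   # True
print(is_tautology("E Apq Cpq"))     # False
```

The last line shows that not every formula will do: 'Cpq' is not equivalent to 'Apq', so it could not serve as a definition of 'A'.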

To go over your questions one by one:

"Why is this permitted?"

Because any time an instance of the formula on one side of the definition (once parentheses get restored) appears within a well-formed formula W, it can get replaced by the formula on the other side (once parentheses get restored) without the resulting formula W' changing from true to false, or from false to true. Or in short, definitional replacement preserves truth (this property is immediately evident for a formula like C $\delta$x $\delta$y once you understand how substitution for $\delta$ works). Thus, it doesn't result in an invalidity. So, if the axioms are sound, definitional replacement preserves soundness.

"What are we allowed to define?"

Any connective can get defined in terms of formulas only having the primitive connectives of the system. This gets done to ensure that the system is adequate.

"What are we even using modus ponens and the axioms for if we can just make up whatever?"

Because only a subset or subclass of "whatever" will qualify as logically sound. Modus ponens is sound. So are the axioms. The definitions also either are sound, or work out as consistent with soundness. So again, the key issue here is soundness.