Why does one have to check if axioms are true?

4.2k Views Asked by At

In Tao's book Analysis 1, he writes:

Thus, from the point of view of logic, we can define equality on a [remark by myself: I think he forgot the word "type of object" here] however we please, so long as it obeys the reflexive, symmetry, and transitive axioms, and it is consistent with all other operations on the class of objects under discussion in the sense that the substitution axiom was true for all of those operations.

Does he mean that, if one wants to define define equality on a specific type of object (like functions, ordered pairs, for example), one has to check that these axioms of equality (he refers to these four axioms of equality as "symmetry", "reflexivity", "transitivity", and "substitution") hold in the sense that one has to prove them? It seems so, because of these two passages:

[In section 3.3 Functions] We observe that functions obey the axiom of substitution: if $x=x'$, then $f(x) = f(x')$ (why?).

(My answer would be "because that's an axiom", but Tao apparently wouldn't accept that.)

And after defining equality of sets ($A=B:\iff \forall x(x\in A\iff x\in B)$), Tao writes (on page 39):

One can easily verify that this notion of equality is reflexive, symmetric, and transitive (Exercise 3.1.1). Observe that if $x\in A$ and $A = B$, then $x\in B$, by Definition 3.1.4. Thus the "is an element of" relation $\in $ obeys the axiom of substitution

So he gives the exercise to prove the axioms of equality for sets. Why does one has to prove axioms? Or, put differently: if one can prove these things, why does he state them as axioms?

7

There are 7 best solutions below

25
On

I believe Tao means that the axioms of reflexivity, symmetry, and transitivity are adequate to capture our pre-existing (this is the key) intuition about what "equality" between two objects ought to be. Let me try two contrasting examples to help unpack what I mean.

Version 1

You: Sets $A$ and $B$ are shmequal provided $x \in A \Leftrightarrow x \in B$ for all $x$.

Me: That sounds like a fine relation to investigate. Creative name, by the way.

Version 2

You: Sets $A$ and $B$ are equal provided $x \in A \Leftrightarrow x \in B$ for all $x$.

Me: Now, hold on just a second. By "equal", you mean "identical" or "exactly the same"? I'm not sure I'm ready to accept that this abstract definition captures all that. You would need to show me that the relation $x \in A \Leftrightarrow x \in B$ for all $x$ is reflexive, symmetric, and transitive before I'm willing to concede that this deserves a name like "equal".


Comment from OP

An example came to mind: we define equality for ordered pairs: $(x,y)=(a,b)⟺x=a∧y=b$. To show that equality for ordered pairs is reflexive we need to show $(x,y)=(x,y)$, which by definition means $x=x∧y=y$. But $x$ and $y$ could itself be ordered pairs. So now we are in the situation where we have to prove that every ordered pair is equal to itself but where we also have to accept this as given.

A weak response from me

I struggle to find good words to address your question, but it might help to remember that we are not checking whether $(x,y) = (x,y)$. Rather this is one of the things we insist should be the case if "equality" is to mean anything; it must apply to identical objects. Instead, we are asking "For ordered pairs, does the $x = a \wedge y = b$ property capture this self-evident truth about equality?". We find that it does: $x = x$ and $y = y$ are both true statements because we are comparing two identical objects in each case.

5
On

You are probably confused because you think that axioms are (by definition) statements that we take as true without any proofs. However, this word has a slightly different meaning.

Axioms are a starting point of a mathematical theory. When you build a theory, for example, Arithmetic, from scratch, you need some preliminary facts, otherwise you cannot prove anything. In Arithmetic and a bunch of other mathematical theories the described properties of equality are indeed axioms, that one does not prove. Equality is a primitive notion, and the only sensible way to actually define it is to postulate that these natural (as it seems to us humans) properties hold.

However, in set theory, these "axioms" are not the definition of equality. Rather, equality is defined via the formula above: two sets are equal when they consist of identical elements. But when we define equality in this way, there is a natural question: why we are naming this as "equality" at all? This is why we prove "axioms of equality", which we are already used to, to show that the naming "equality" is adequate. And when we prove them, they become theorems of set theory and properties of equality rather than axioms. This is because set theory is more fundamental and more powerful than most mathematical theories in a sense that you can build (almost) all mathematics based on it.

3
On

There are axioms and then there axioms. Most of the time mathematicians use the word "axiom" they mean it in a definitional sense. Instead of "An equality is a relation satisfying the reflexivity, symmetry, transitivity, and substitutivity axioms" think instead "By definition, equality is any relation for which reflexivity, symmetry, transitivity, and substitutivity hold". In other words, you can call some relation an equality if you can show that it meets the definition, i.e. that reflexivity, symmetry, transitivity, and substitutivity are true of it. So the answer to your first question is "yes". To put it another way, these "axioms" are true of equality relations, but you have to show that your relation is in fact an equality relation. This is exactly the same situation as the definition of a mathematical group for instance.

To put it an even better way, these "axioms" are axioms in the "theory of equality" and you want to show that a particular relation is a model/interpretation/semantics for that theory. Very briefly and roughly sketching, in formal logic, a theory is a collection of symbols and a collection of rules. A theory will define certain arrangements of symbols to be formulas (or sentences or terms). There will also be a notion of "theoremhood" defined by the rules. Typically the rules will have the form "if these arrangements of symbols are theorems, then this arrangement of symbols is a theorem". You could call these rules "axioms", though in this context usually only rules with no premises, i.e. that simply baldly state "this arrangement of symbols is a theorem" with no conditions, are called "axioms". However, the "axiom of symmetry", say, corresponds more closely to a rule than an axiom in this stricter sense. A theory (and the logic in which it is formulated) thus gives rise to a language.

Of course, we usually want to talk about circles and torii and other mathematical objects that we don't (usually...) think of a "arrangements of symbols". To connect a theory to some mathematical objects we use a semantics which is an assignation of mathematical objects to the arrangements of symbols in a consistent manner (usually satisfying some conditions dependent on the logic in which the theory is formulated) such that the rules are satisfied. If the rules aren't satisfied, then the assignment is not a semantics for the theory.

So Tao is (implicitly) specifying a theory of equality and these exercises are asking you to show that particular interpretations (i.e. assignments) are semantics for that theory.

So how does this relate to set theory as the "foundation" of mathematics or axioms as "self evident truths"? As far as "foundations" are concerned the situation is that mathematicians for the most part have agreed to (pretend to) work within the language of Zermelo-Fraenkel (ZF) set theory which is a theory in first-order logic. There is no question of "true" or "false" in this scenario. We simply have some formulas that are called theorems and rules for making more theorems. The rules/axioms simply define what a "theorem" is. The axioms in the stricter sense are then the starting points for deriving theorems as Wolfram said. However, we can talk about semantics for ZF set theory and to show that an assignment is a semantics we would indeed be obliged to prove the "axiom of pairing" and the "axiom of infinity" and all the other axioms of ZF set theory hold for our interpretation. This is something that is done in set theory and logic.

Pragmatically, as I stated in the first paragraph, you should just interpret "axiom" in this and most cases as "condition that needs to hold to meet the definition". The rest of this answer was more explaining how this usage of the word "axiom" is, in fact, more or less consistent with the usage in e.g. "axioms of set theory" or "axioms of geometry".

0
On

I've personally come to think of axioms as little components of a big definition. For example, the axioms at the start of the Elements define what we mean by "Euclidean geometry"; the Peano axioms serve to define what we mean by "natural number", and so forth.

This gives a somewhat cleaner style than a definition that goes on for a paragraph or two with lots of conjunctions. And it serves to make the parts distinct, so that we can study the effects of maybe swapping out or changing one of them and keeping the rest the same.

So Tao is listing all the sub-parts in the definition of what we mean by "equality". What equality means is assumed in advance by these axioms. Now the question is: Does a proposed new relation really count as equality, or not? That needs to be established by proving it meets all of the criteria, that is, all of the component axioms of the definition.

0
On

So he gives the exercise to prove the axioms of equality for sets. Why does one has to prove axioms? Or, put differently: if one can prove these things, why does he state them as axioms? $ \def\imp{\Rightarrow} \def\eq{\Leftrightarrow} $

He does not prove the axiom as stated. The axiom claims equality between two sets iff they have exactly the same members. This equality symbol "$=$" is the symbol in the foundational system itself, and there is no way you will be able to prove an axiom of the foundational system if it is independent of the other axioms. In fact, the equality symbol is part of first-order logic itself. So what exactly does Terence Tao mean?

He wrote:

One can easily verify that this notion of equality is reflexive, symmetric, and transitive (Exercise 3.1.1). Observe that if $x∈A$ and $A=B$, then $x∈B$, by Definition 3.1.4. Thus the "is an element of" relation $∈$ obeys the axiom of substitution.

This is logically not precise unless he is working in first-order logic without equality. It would be clearer to define the binary relation $\sim$ such that $A \sim B$ iff $\forall x\ ( x \in A \eq x \in B )$. Then it makes sense to ask whether $\sim$ is an equivalence relation or not, and whether it obeys substitution. It is trivial to prove that it is indeed symmetric, reflexive and transitive. In a formal system, we should think of the notion that a relation $\sim$ obeys substitution as meaning that $x \sim y$ implies that $P(x) \eq P(y)$ for any $1$-input sentence $P$. It turns out that in a first-order language we do not need to check for every $1$-input sentence, because it suffices to check for each of the non-logical symbols of the language. Since the language of set theory only has one non-logical symbol "$\in$", the precise notion of "obeys substitution" in set theory is hence:

$\forall A,B\ ( A \sim B \imp \forall C\ ( C \in A \eq C \in B ) \land \forall C\ ( A \in C \eq B \in C ) )$.

Observe that the first half of the claim is trivially true by definition of $\sim$. I don't see the second half in the quotes of Terence Tao that you have in your question. If I'm not mistaken, it is not provable, in which case Terence didn't really prove full substitutivity.

0
On

The philosophy is that axioms need to be as simple as possible, basic, and few in number. In this way we can argue about the validity of such axioms very little. In contrast, look into the history of the parallel postulate. The axioms you mention have immediate consequences; however, if we do not prove those consequences they must be accepted as part of the axiom, and hence the axiom is more complex than necessary.

For instance,

We observe that functions obey the axiom of substitution: if $x=x′$, then $f(x)=f(x′)$ (why?).

does not follow from the equality axiom alone, but also from the definition of function. Note that it does not hold for all relations, such as $r(x)=\pm \sqrt x$. Therefore, to be minimalist in axioms, we must prove that substitution holds for functions. This is where set theory and mathematics logic become quite powerful, and we undertake these simplistic problems to cut our teeth.

Likewise,

One can easily verify that this notion of equality is reflexive, symmetric, and transitive (Exercise 3.1.1). Observe that if $x\in A$ and $A=B$, then $x\in B$, by Definition 3.1.4.

is not stated in the axiom, so must either be included in the axiom or proven. It looks like, and feels like, it is part of the axiom, and the proof is almost "because the axiom says so," but the minimalist approach demands the proof.

3
On

Austin Mohr's answer is excellent; however, as it led to an extensive argument in the comments I wish to put forth a slight simplification and restatement of it which may add something.

(I posted this as a comment originally, but wanted to expand on it.)


So he gives the exercise to prove the axioms of equality for sets. Why does one has to prove axioms? Or, put differently: if one can prove these things, why does he state them as axioms?

The basic point to realize here is that English words have meaning outside of math.

You're free to invent any relation you wish, with whatever properties you wish. You can label it with whatever made-up term you want without any necessity to prove anything about it.

If you do this, you are just defining what you are talking about, and you can then proceed to say something using your stated definitions.

However, if you use an English word to name or describe your relation, then you should take into account the English meaning of the word. That is, in choosing an English word as a name, you should choose one which aligns with the properties your relation has.

And the corollary: If you want to use a particular English word as a name for your relation, you should be sure your relation has the properties that would be implied by that name.

This applies whether we are naming relations, operators, or anything else.


If I define a unary operator and call it the "inverse," but my operator has the property that repeated application of that operator will never produce the original input, I have misnamed it.

If I define a relation and call it the "equality relation," but it is not transitive nor symmetric, I have again chosen the wrong name.

(Can you misname your relations and operators? Of course you can. The only thing it will break down is your communication with other people, which is of course the only reason to have names for things in the first place.)

A relation which is not reflexive, symmetric, and transitive, will violate the English-language meaning of the word "equal," so if your relation may not have those properties then use another name for it instead.


In this case, the "equality relation" between sets has been defined in your textbook, and it is now your task to show the appropriateness of the label "equality relation" for the defined relation, by showing that it has the properties one would expect out of the English-language meaning of the word "equality."