Say $a=b$. Is "Do the same thing to both sides of an equation, and it still holds" an axiom?


Recently I have started reviewing mathematical notions that I have always just accepted. Today it is one of the fundamental ones used in equations:

If we have an equation, then the equation holds if we do the same to both sides.

This seems perfectly obvious, but it must be stated as an axiom somewhere, presumably in formal logic(?). Only, I don't know what it would be called, or indeed how to search for it - does anybody know?


There are 5 best solutions below

11
On BEST ANSWER

This axiom is known as the substitution property of equality. It states that if $f$ is a function, and $x = y$, then $f(x) = f(y)$. See, for example, Wikipedia.

For example, if your equation is $4x = 2$, then you can apply the function $f(x) = x/2$ to both sides, and the axiom tells you that $f(4x) = f(2)$, or in other words, that $2x = 1$. You could then apply the axiom again (with the same function, even) to conclude that $x = 1/2$.
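As a quick sanity check, the two steps above can be replayed with exact rational arithmetic. This is only a sketch: the name `halve` is made up here to stand for $f(x) = x/2$, and the value $x = 1/2$ is plugged in rather than derived.

```python
from fractions import Fraction

def halve(value):
    """Hypothetical name for the function f(x) = x / 2 from the example."""
    return value / 2

x = Fraction(1, 2)              # the value the example arrives at
lhs, rhs = 4 * x, Fraction(2)   # the two sides of 4x = 2

assert lhs == rhs                                    # 4x = 2
assert halve(lhs) == halve(rhs)                      # 2x = 1
assert halve(halve(lhs)) == halve(halve(rhs)) == x   # x = 1/2
```

Each assertion compares two ways of writing one number, which is why applying `halve` to both sides cannot break the equality.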

6
On

"Do the same to both sides" is rather vague. What we can say is that if $f:A \rightarrow B$ is a bijection between sets $A$ and $B$ then, because a bijection is in particular injective,

$\forall x,y \in A:\ x=y \iff f(x)=f(y)$

The operation of adding $c$ (and its inverse subtracting $c$) is a bijection in groups, rings and fields, so we can conclude that

$x=y \iff x+c=y+c$

However, multiplication by $c$ is only a bijection for certain values of $c$ ($c \ne 0$ in fields, $\gcd(c,n)=1$ in $\mathbb{Z}_n$ etc.), so although we can conclude

$x=y \Rightarrow xc = yc$

it is not safe to assume the converse i.e. in general

$xc=yc \nRightarrow x = y$

and we have to take care about which values of $c$ we can "cancel" from both sides of the equation.
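The $\mathbb{Z}_n$ case can be checked by brute force. The sketch below uses $n = 12$ as an arbitrary choice, and `is_cancellable` is a name invented for this illustration; it tests whether $x \mapsto xc$ is injective on $\mathbb{Z}_n$.

```python
from math import gcd

def is_cancellable(c, n):
    """True if x*c == y*c (mod n) forces x == y (mod n), checked by brute force."""
    images = {(x * c) % n for x in range(n)}
    return len(images) == n  # all n images distinct means the map is injective

n = 12  # arbitrary modulus chosen for the illustration
for c in range(n):
    # c can be cancelled exactly when gcd(c, n) == 1
    assert is_cancellable(c, n) == (gcd(c, n) == 1)

# A concrete failure: 2*3 == 6*3 (mod 12), but 2 != 6 (mod 12).
assert (2 * 3) % 12 == (6 * 3) % 12
assert 2 % 12 != 6 % 12
```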

Some polynomial functions are bijections in $\mathbb{R}$ e.g.

$x=y \iff x^3=y^3$

but others are not e.g.

$x^2=y^2 \nRightarrow x = y$

unless we restrict the domain of $f(x)=x^2$ to, for example, non-negative reals. Similarly

$\sin(x) = \sin(y) \nRightarrow x = y$

unless we restrict the domain of $\sin(x)$.

So in general we can only "cancel" a function from both sides of an equation if we are sure it is a bijection, or if we have restricted its domain or range to create a bijection.
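Here is a rough way to see the difference between cubing and squaring. It only samples a handful of integers, so it illustrates the claim rather than proving anything about all of $\mathbb{R}$; `is_injective` is a helper name invented for this sketch.

```python
def is_injective(f, domain):
    """True if f takes distinct values on distinct elements of a finite domain."""
    images = [f(x) for x in domain]
    return len(set(images)) == len(images)

integers = range(-5, 6)       # small sample including negatives
non_negative = range(0, 6)    # the restricted domain

assert is_injective(lambda x: x ** 3, integers)       # cubing can be cancelled
assert not is_injective(lambda x: x ** 2, integers)   # (-2)**2 == 2**2
assert is_injective(lambda x: x ** 2, non_negative)   # fine once restricted
```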

4
On

There is a way in which it is an axiom, and another in which it isn't. I'll try to describe both.

When you do formal logic, you start with variables ($x_1,...,x_n,...$), relation symbols ($R_i, i\in I$) and function symbols ($f_j, j\in J$), and you build your formulas out of these. An important notion is that of a term in this language. A term is defined recursively as either a variable, or a string of the form $f_j(t_1,...,t_n)$ where $f_j$ is a function symbol of arity $n$ and $t_1,...,t_n$ are terms (this is purely syntactical).

Then you build a deduction system that consists of rules that you may apply in certain situations (for instance if you proved $A$ and $B$, you can prove $A\land B$).

One of these rules is the substitution rule. One way to define it is the following: for any terms $t_1,...,t_n, u_1,...,u_n$ and any function symbol $f$ of arity $n$, if for all $i$, $t_i = u_i$ is proved, then one may deduce $f(t_1,...,t_n) = f(u_1,...,u_n)$.

In this situation it is an axiom.

However, in the common situation of algebra and "the working mathematician", it is a consequence of another substitution rule, the substitution rule for relation symbols. Indeed, all maths can be built from set theory with no function symbol and only one relation symbol ($\in$). In this setting a function $A\to B$ is defined (for instance) as a subset $f$ of $A\times B$ such that for all $a\in A$, there is a unique $b\in B$ such that $(a,b)\in f$. $f(a)$ is then defined as this unique $b$.

Now if $x=y \in A$, $f:A\to B$ is a function, then $(x,f(x)) \in f$ and $(y,f(y))\in f$ and so by the substitution rule for relation symbols $(x,f(y))\in f$, so that $f(x)=f(y)$ (by uniqueness).
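To make the "function as a set of pairs" picture concrete, here is a small finite sketch. The sets `A`, `B` and the pairs in `f` are made up for the illustration, and `is_function` and `value_at` are hypothetical helper names, not standard ones.

```python
def is_function(pairs, A, B):
    """True if pairs is a subset of A x B with exactly one (a, b) for each a in A."""
    if not all(a in A and b in B for a, b in pairs):
        return False
    return all(sum(1 for p, _ in pairs if p == a) == 1 for a in A)

def value_at(pairs, a):
    """Return the unique b with (a, b) in pairs; this plays the role of f(a)."""
    (b,) = [q for p, q in pairs if p == a]
    return b

A = {1, 2, 3}
B = {"odd", "even"}
f = {(1, "odd"), (2, "even"), (3, "odd")}

assert is_function(f, A, B)
assert not is_function({(1, "odd"), (1, "even")}, {1}, B)  # two outputs for 1

x, y = 2, 1 + 1   # two expressions naming the same element of A
assert x == y
assert value_at(f, x) == value_at(f, y)   # forced by uniqueness of b
```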

The substitution rule for relation symbols is very similar to the one for function symbols and is, again, an axiom. It can be described as follows: if $t_1,...,t_n,u_1,...,u_n$ are terms and $R$ is a relation symbol of arity $n$, and if for all $i$, $t_i= u_i$ has been proved and $R(t_1,...,t_n)$ has been proved, then one may deduce $R(u_1,...,u_n)$.

(you may see that the rule is not so far from an axiom, but technically it's not one in this second situation)

1
On

I remember being confused about the difference between arguments written in the "do the same thing to both sides" format, e.g. \begin{align} 7x &= 3 - 5x \\ 12x &= 3 \\ x &= 3/12, \end{align} and arguments written in a sort of "chain of equalities" format, e.g. \begin{align} x = \frac{1}{12}(12x) = \frac{1}{12}(7x + 5x) = \frac{1}{12}(3 - 5x + 5x) = \frac{3}{12}. \end{align} With such a simple example the latter looks a bit silly, but I would say that in the end it is the more mathematically mature style of writing. It became clearer to me when I realized that when you are manipulating an equation there is only one number there (it's just, let's say, written down in two different ways).

So the format where you write an equation and do the same thing to both sides is sort of a suspension of disbelief until you find $x$ at the end, i.e. you are not fully willing to accept the equality until you find an $x$ for which it is true, but to find an $x$ for which it is true you are assuming that there is one. This is completely natural. This 'suspension of disbelief' allows you to solve the equation: you can't actually start with $x = ....$ and just intuit what to write in the "chain of equalities" format, you can only do that because you've already worked it out. And yet after you've worked it out and written it down in this way, it's actually surprisingly easy to read, because each step in the chain of equalities is just a rewriting of exactly the same number. I'm not 'doing' anything to 'sides' of some abstract entity called an 'equation'.

1
On

Any deterministic process that is performed on the same inputs will result in the same output. That is the definition of "deterministic". If two things are equal, then you're applying the same process to the same thing, so you get the same output. The expression "x = 5" is claiming that the symbol "x" and the symbol "5" represent the same value. Thus, "f(x)" and "f(5)" also must represent the same value, since they are both f applied to the same value.

In CS, the term "function" is sometimes used to refer to nondeterministic processes; for instance, "choose a random number from between 1 and n" might be considered a "function" of n by some programmers (though not functional programmers), even though the process applied to the same number can result in different outputs. So you do have to be sure that your process is deterministic.
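A small sketch of the distinction, with made-up names `double` and `roll` (the seed is fixed only so the demonstration is repeatable):

```python
import random

def double(n):
    """Deterministic: the output depends only on n."""
    return 2 * n

def roll(n):
    """Not a function of n in the mathematical sense: it also reads hidden random state."""
    return random.randint(1, n)

random.seed(0)  # fixed seed so repeated runs of this demo behave the same
double_outputs = {double(6) for _ in range(100)}
roll_outputs = {roll(6) for _ in range(100)}

assert len(double_outputs) == 1   # same input, same output every time
assert len(roll_outputs) > 1      # same input, several different outputs
```

So from `x == 6` you may conclude `double(x) == double(6)`, but not `roll(x) == roll(6)`.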

Another issue one has to be careful about is that the process takes values as inputs, not expressions. For instance, if $f$ represents the process "subtract 1 from the numerator and denominator", then $f(\frac{2}{3}) =\frac{1}{2}$ while $f(\frac{4}{6}) =\frac{3}{5}$; although the expressions $\frac{2}{3}$ and $\frac{4}{6}$ represent the same value, $f$ applied to them does not result in expressions that represent the same value, because $f$ is acting on the representations of those values, and not the values themselves. Sometimes, a process that is expressed as acting on the representation can still be well-defined as a function of the value. For instance, squaring the numerator and denominator results in the same value, regardless of what representation of a rational number you pick.
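The fraction example can be spelled out by keeping a representation (a numerator/denominator pair) separate from the value it denotes. The helper names below are invented for this sketch.

```python
from fractions import Fraction

def subtract_one(num, den):
    """Acts on the representation: subtract 1 from numerator and denominator."""
    return (num - 1, den - 1)

def square_parts(num, den):
    """Also acts on the representation, but respects the underlying value."""
    return (num * num, den * den)

def value(pair):
    """The rational number that a (numerator, denominator) pair represents."""
    return Fraction(*pair)

a, b = (2, 3), (4, 6)
assert value(a) == value(b)                                  # same value
assert value(subtract_one(*a)) != value(subtract_one(*b))    # 1/2 vs 3/5
assert value(square_parts(*a)) == value(square_parts(*b))    # 4/9 vs 16/36
```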

A final issue is that although a = b implies f(a) = f(b), f(a) = f(b) does not imply a = b. This means that not only do you have to be careful about "canceling" an operation that's been done to both sides, you also have to be careful about getting extraneous solutions.
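For a concrete extraneous solution, take the equation $\sqrt{x} = x - 2$ (chosen here purely as an illustration). Squaring both sides gives $x = (x-2)^2$, which has an extra root. The sketch below just searches small integers; the function names are made up.

```python
from math import sqrt, isclose

def satisfies_original(x):
    """Checks sqrt(x) = x - 2."""
    return isclose(sqrt(x), x - 2)

def satisfies_squared(x):
    """Checks x = (x - 2)^2, obtained by squaring both sides."""
    return x == (x - 2) ** 2

candidates = [x for x in range(0, 10) if satisfies_squared(x)]
solutions = [x for x in candidates if satisfies_original(x)]

assert candidates == [1, 4]   # the squared equation has two roots
assert solutions == [4]       # x = 1 is extraneous: sqrt(1) = 1 but 1 - 2 = -1
```

Squaring is not injective, so the squared equation follows from the original one but not the other way round, and each candidate has to be checked against the original equation.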