Is the proof of Pythagorean theorem using dot (inner) product circular?

6.2k Views Asked by At

$x,y$ are perpendicular if and only if $x\cdot y=0$. Now, $||x+y||^2=(x+y)\cdot (x+y)=(x\cdot x)+(x\cdot y)+(y\cdot x)+(y\cdot y)$. The middle two terms are zero if and only if $x,y$ are perpendicular. So, $||x+y||^2=(x\cdot x)+(y\cdot y)=||x||^2+||y||^2$ if and only if $x,y$ are perpendicular. ( I copied this)

I think this argument is circular because the property

$x\cdot y=0 $ implies $x$ and $y$ are perpendicular

comes from the Pythagorean theorem.

Oh, it just came to mind that the property could be derived from the law of cosines. The law of cosines can be proved without the Pythagorean theorem, right, so the proof isn't circular?

Another question: If the property comes from the Pythagorean theorem or cosine law, then how does the dot product give a condition for orthogonality for higher dimensions?

Edit: The following quote by Poincare hepled me regarding the question:

Mathematics is the art of giving the same name to different things.

4

There are 4 best solutions below

4
On BEST ANSWER

I think the question mixes two quite different concepts together: proof and motivation.

The motivation for defining the inner product, orthogonality, and length of vectors in $\mathbb R^n$ in the "usual" way (that is, $\langle x,y\rangle = x_1y_1 + x_2y_2 + \cdots + x_ny_n$) is presumably at least in part that by doing this we will be able to establish a property of $\mathbb R_n$ corresponding to the familiar Pythagorean Theorem from synthetic plane geometry. The motivation is, indeed, circular in that we get the Pythagorean Theorem as one of the results of something we set up because we wanted the Pythagorean Theorem.

But that's how many axiomatic systems are born. Someone wants to be able to work with mathematical objects in a certain way, so they come up with axioms and definitions that provide mathematical objects they can work with the way they wanted to. I would be surprised to learn that the classical axioms of Euclidean geometry (from which the original Pythagorean Theorem derives) were not created for the reason that they produced the kind of geometry that Euclid's contemporaries wanted to work with.

Proof, on the other hand, consists of starting with a given set of axioms and definitions (with emphasis on the word "given," that is, they have no prior basis other than that we want to believe them), and showing that a certain result necessarily follows from those axioms and definitions without relying on any other facts that did not derive from those axioms and definitions. In the proof of the "Pythagorean Theorem" in $\mathbb R^n,$ after the point at which the axioms were given, did any step of the proof rely on anything other than the stated axioms and definitions?

The answer to that question would depend on how the axioms were stated. If there is an axiom that says $x$ and $y$ are orthogonal if $\langle x,y\rangle = 0,$ then this fact does not logically "come from" the Pythagorean Theorem; it comes from the axioms.

14
On

Remember that $\mathbb{R}^n$ as a vector space over $\mathbb{R}$ with standard addition and scalar multiplication isn't rich enough to do geometry. When we talk about geometry, particularly the Euclidean geometry, we want to study the transformations that keep some properties like length and angles preserved, namely isometries. Most of our theorems will be about how two angles, lengths or congruent circles and we define two shapes to be congruent if and only if they can be transformed into each other by one of these 'transformations' (isometries). Viewing geometry from this point of view, i.e. through its group of transformations, is related to what is called the Erlangen program.

Anyway, it turns out that both of these properties, i.e. length and angle, can be packaged into one algebraic concept called the inner product (which I'm going to define below). More precisely, if $\vec{x} \in \mathbb{R}^n$, then

$$\| \vec{x} \| = \sqrt{\langle \vec{x}, \vec{x} \rangle}$$ $$\angle (\vec{x}, \vec{y}) = \arccos\bigg(\frac{\langle \vec{x},\vec{y}\rangle}{\sqrt{\langle \vec{x},\vec{x}\rangle} \sqrt{\langle\vec{y},\vec{y}\rangle}}\bigg)$$

The last definition makes sense because of the Cauchy-Schwartz inequality which is true for inner products: $$|\langle \vec{x},\vec{y} \rangle|^2 \leq \langle\vec{x},\vec{y} \rangle \langle \vec{y}, \vec{y}\rangle$$

And hence, there exists a unique angle between any two vectors.

Now, by definition, an inner product is a bilinear form $\langle \cdot, \cdot \rangle: \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ such that

  1. $\langle \vec{x}, \vec{x} \rangle \geq 0$ and $\langle \vec{x}, \vec{x} \rangle = 0$ if and only if $\vec{x}=\vec{0}$
  2. $\langle \vec{x}, \vec{y} \rangle = \langle \vec{y}, \vec{x} \rangle$
  3. $\langle \overrightarrow{\alpha x+\beta z}, \vec{y} \rangle = \alpha\langle \vec{x}, \vec{y} \rangle + \beta\langle \vec{z}, \vec{y} \rangle$

Again, by definition two vectors are perpendicular if and only if their inner product is zero. Accepting these definitions, the Pythagorean theorem can now be proven without any circular reasoning.

Now I understand that you feel kind of uncomfortable because you say that the standard inner product on $\mathbb{R}^n$ comes from the Pythagorean theorem for $n=2,3$. And I understand that some people here interpret this as circular reasoning. However, I have two issues with this interpretation:

  1. our definition for the standard inner product on $\mathbb{R}^n$ for $n>3$ doesn't come from any previously known fact. Does this mean that our definition for $n>3$ is less valid than for $n=2,3$ or we should give up on higher dimensions?

  2. It happens times and times in physics, as well as mathematics, that a new theory is based on some known facts in earlier theories as observations for a general pattern. Then to show that our new theory is better, they try to show that the results predicted by the previous theory can be obtained in the new framework as well. Should we say that it's circular reasoning? So, if someone applies a stronger theory to prove something that has motivated it in the earlier theory, is it circular reasoning? It seems that many people here think so. Even though I don't challenge it, I don't like this interpretation. I believe that once we have a new theory with its own definitions, which can be taught to a person without any prior knowledge of the earlier facts, it is not unreasonable to focus our attention only on our current framework and forget about its past like it never existed. Meanwhile, we can use our new theory to prove the results proven in our earlier theory.

0
On

If we start from Euclidean geometry in two dimensions, then it is circular. In this settings, we define perpendicular lines as lines that, at their intersection form four equal angles. Coordinate systems are defines using a basis consisting of two perpendicular vectors of the same length.

It is that actually easy to prove that two vectors are perpendicular iff their dot product is 0 using rotational properties. We would hit a snag, however, at the claim that the two-norm equals the length, which is a direct consequence of Pythagorean theorem.

Of course, that is not the context this proof of the Pythaogrean theorem normally shows up in. It shows up when we start from a different set of definitions: We define orthogonality using dot products. We define length using 2-norm. Then we prove a Vector space Pythagorean theorem as you have shown above, without circular reasoning.

Of course, now we have two different definitions for length and perpendicularity, and two different Pythagorean theorems. It is a separate step to prove that in $\mathbb{R}^2$ and $\mathbb{R}^3$, both sets of definitions actually mean the same things, and so both Pythagorean theorems are in fact the same theorem.

2
On

Let's try this on a different vector space. Here's a nice one: Let $\mathscr L = C([0,1])$ be the set of all real continuous functions defined on the interval $I = [0,1]$. If $f, g \in \mathscr L$ and $a,b \in \Bbb R$, then $h(x) := af(x) + bg(x)$ defines another continuous function on $I$, so $\scr L$ is indeed a vector space over $\Bbb R$.

Now I arbitrarly define $f \cdot g := \int_I f(x)g(x)dx$, and quickly note that this operation is commutative and $(af + bg)\cdot h = a(f\cdot h) + b(g \cdot h)$, and that $f \cdot f \ge 0$ and $f\cdot f = 0$ if and only if $f$ is the constant function $0$.

Thus we see that $f\cdot g$ acts as a dot product on $\scr L$, and so we can define $$\|f\| := \sqrt{f\cdot f}$$ and call $\|f - g\|$ the "distance from $f$ to $g$".

By the Cauchy-Schwarz inequality $$\left(\int fg\right)^2 \le \int f^2\int g^2$$ and therefore $$|f\cdot g| \le \|f\|\|g\|$$

Therefore, we can arbitrarily define for non-zero $f, g$ that $$\theta = \cos^{-1}\left(\frac{f\cdot g}{\|f\|\|g\|}\right)$$ and call $\theta$ the "angle between $f$ and $g$", and define that $f$ and $g$ are "perpendicular" when $\theta = \pi/2$. Equivalently, $f$ is perpendicular to $g$ exactly when $f \cdot g = 0$.

And now we see that a Pythagorean-like theorem holds for $\scr L$: $f$ and $g$ are perpendicular exactly when $\|f - g\|^2 = \|f\|^2 + \|g\|^2$


The point of this exercise? That the vector Pythagorean theorem is something different from the familiar Pythagorean theorem of geometry. The vector space $\scr L$ is not a plane, or space, or even $n$ dimensional space for any $n$. It is in fact an infinite dimensional vector space. I did not rely on geometric intuition to develop this. At no point did the geometric Pythagorean theorem come into play.

I did choose the definitions to follow a familiar pattern, but the point here is that I (or actually far more gifted mathematicians whose footsteps I'm aping) made those definitions by choice. They were not forced on me by the Pythagorean theorem, but rather were chosen by me exactly so that this vector Pythagorean theorem would be true.

By making these definitions, I can now start applying those old geometric intuitions to this weird set of functions that beforehand was something too esoteric to handle.

The vector Pythagorean theorem isn't a way to prove that old geometric result. It is a way to show that the old geometric result also has application in this entirely new and different arena of vector spaces.