Surprisingly elementary and direct proofs

5.1k Views Asked by At

What are some examples of theorems, whose first proof was quite hard and sophisticated, perhaps using some other deep theorems of some theory, before years later surprisingly a quite elementary, direct, perhaps even short proof has been found?

A related question is MO/24913, which deals with hard theorems whose proofs were simplified by the development of more sophisticated theories. But I would like to see examples where this wasn't necessary, but rather the theory turned out to be superfluous as for the proof of the theorem. I expect that this didn't happen so often. [Ok after reading all the answers, it obviously happened all the time!]

9

There are 9 best solutions below

5
On

Perhaps the most exciting and dramatic of the difficult inequalities is Arhangel'skii's theorem that $|X|\le \exp(L(X)\chi(X))$ for every Hausdorff space. The countable version of this result, namely that every Lindelöf, first countable, Hausdorff space has cardinality at most $\mathfrak c$, answer the following fifty-year old question of Alexandroff and Urysohn. Does there exists a compact, first countable space having cardinality greater that the continuum?

As one might guess, Arhangel'skii's original proof was quite difficult. The argument given in Set theoretic Topology Page 19 is due to Pol. It is not difficult for one to undestand. The countable version of this proof should be within the reach of any first-year graduate student in mathematics. The theorem is sufficently important to be included in any introductory graduate course in set-theoretic topology, and provides exposure to modern topology at an early level of mathematical training.

1
On

Gauss's initial proof of quadratic reciprocity in Disquisitiones was exceptionally long spanning I believe 30+ pages, and in my opinion it is incredibly nasty. Gauss used induction and then reduced to a something like eight cases. While not necessarily relying on big theorems this proof seems overly complicated especially when compared to some of the proofs that followed.

For example, Eisenstein's proof is quite short and has a very nice geometric component that I think is very nice, and it is does not require a great deal more machinery than Gauss's original.

There are in fact may other nice proofs of quadratic reciprocity. Here's a nice list of some of them

0
On

Too long for a comment.

I think Bill Johnson's answer (Lomonosov 1973) on MO applies to your question as well. Some superfluous theory that was used for the first proof of a weaker result (Bernstein-Robinson 1966) was nonstandard analysis. It was Halmos who immediately removed the nonstandard analysis from the argument.

These things were certainly not easy to figure out because von Neumann himself only proved the existence of nontrivial invariant subspaces for compact operators on Hilbert space ($\dim \geq 2$). Lomonosov entails hyperinvariant ones on general Banach spaces.

See this wikipedia page on the invariant subspace problem for statements and a chronology of related results. Maybe that's a good opportunity to recall that it is still not known whether every bounded operator on an infinite-dimensional separable Hilbert space admits a nontrivial invariant subspace.

9
On

Goedel's original completeness/compactness proofs in logic are very hard and technical. Modern versions of the proofs are considerably simpler and do not use any sophisticated new theory.

Most proofs of the fundamental theorem of algebra use some topological/homotopical deep(ish) results or some deep(ish) results of complex analysis. In fact, the theorem can be proved using only the definition of complex numbers and absolute value, using very elementary properties of complex numbers, and the entire proof is about half a page long (such a proof is actually a minimum modulus argument but applied to a polynomial so that the computation can be done directly without appeal to the general minimum modulus principle). The proof is extremely elementary.

The proof that Euclide's fifth postulate is independent of the other axioms of Euclidean geometry by exhibiting models (i.e., the sphere and the hyperbolic plane) where the postulate fails (in different ways) is completely elementary. Certainly all those who tried to prove/disprove the fifth postulate did so while walking on a counterexample. However, the numerous attacks on the problem prior to its settlement in the 20th century were quite sophisticated and laborious. The barrier was conceptual - the understand of the importance of models - not technical.

Brouwer's topological proof of his fixed point theorem used (I believe) homology and/or the fundamental group. A perfectly elementary proof using Sperner's Lemma was later discovered (I'm not entirely sure about the chronological order here, so I may be mistaken).

Initial proofs that the higher homotopy groups are all abelian were greatly simplified by the Eckman-Hilton argument (which is completely elementary).

The Mac Lane coherence theorem in category theory is rather technical and is enormously simplified by considering the Yoneda embedding in the 2-categorical setting. All of the machinery was already present, but it required some re-assembly to figure out a rather elementary proof.

1
On

Chebyshev had first proven the Bertrand's postulate which states that for any integer $n > 3$, there always exists at least one prime number $p$ with $n < p < 2n − 2$. A weaker but more elegant formulation is: for every $n > 1$ there is always at least one prime $p$ such that $n < p < 2n$.

Later Paul Erdős had given an elementary yet elegant proof of this. Surprisingly he was only 20 years old when he had given this proof.

2
On

Around 1911 the group theorist William Burnside proved his famous $pq$-Theorem:

Let $p$ and $q$ be primes and $G$ a group of order $p^aq^b$. Then G is solvable.

Burnside used linear representations over $\mathbb{C}$, in particular character theory. Although his proof was not extremely difficult, in the late 60s one began to look for a less sophisticated, that is a non-character proof, relying on new developments in group theory. In fact, David Goldschmidt was able to apply techniques that had been developed by his thesis adviser John Thompson and proved “by elementary means” the case where $p$ and $q$ are both odd. Not long after, Bender and Matsumaya found proofs for the remaining case and the combination of their arguments led to an overall simpler proof than the original one given by Goldschmidt.

0
On

In general proofs that are very intricate and laborious (like brute force for example) give way to more elegant proofs throughout the ages. However, these proofs rely on an arsenal of concepts much more sophisticated than their predecessors. But a language is more sophisticated conceptual offset by an economy in brute force.

I think the most beautiful example that satisfies your question is the story of the proofs of Sylow's Theorem. See in this paper a elegant proof for exemple.

The Sylow theorems have been proven for the first time by the Norwegian mathematician Ludwig Sylow in 1872. See wikipedia for references.

"Classical proofs" using the Conjugacy class equation and insights in counting are proofs very laborious. A proof with Conjugacy class equation can be seen in the excelent book Abstract Algebra: An Introduction by Thomas W. Hungerford.

In outher execelent Hungerford's book we see a simple, elegant and sophisticated proof ( see p. 93) using the concept of group action ( see p. 88) .

1
On

The Szemeredi-Trotter incidence theorem gives an upper bound on the total number of incidences between a finite set of points and a finite set of lines in the plane. An incidence is a pair $(\ell, x)$ consisting of a line $\ell$ and a point $x \in \mathbb{R}^2$ such that $x \in \ell$.

The original proof of the theorem consisted of a tricky cell-decomposition trick which abound in additive combinatorics. The newer slick proof comes from an argument of Szekely which uses elementary graph theory techniques. The details are summarized very nicely by Terry tao here and here. Ultimately the proof relied on Euler's formula $f -e + v = 2$, the basic inequalities one sees when studying graphs for the first time, such as $e \leq 3v - 6$, and an upper bound on the crossing number of a specific subgraph. The newer proof is much easier to grasp and can certainly be taught to undergraduates.

An overview of the older formulation can be found on Terry Taos'b log (specifically here and here).

2
On

Please allow me to introduce an interested proof of the following theorem:

Theorem. Let $f:X\to Y$ be a quasi-finite projective morphism of locally Noetherian schemes. Then $f$ is affine (hence, in particular, finite).

Proof. We may assume without loss of generality that $Y$ is connected. By assumption, there exists a locally free sheaf $E$ of rank $r+1 < \infty$ on $Y$, and a closed immersion $i:X \to \mathbb{P}_Y(E)$ over $Y$. Write $L:= \mathcal{O}_{\mathbb{P}_Y(E)/Y)}(1)$. Then we obtain (by pulling back of the Eular sequence) the following exact sequence on $X$: $$ \require{AMScd} \begin{CD} 0 @>>> i^*\Omega_{\mathbb{P}_Y(E)/Y}\otimes L @>{k}>> f^*E @>>> L @>>> 0. \end{CD} $$ Now, the surjection $k^{\vee}:f^*E^{\vee} \to i^*(\Omega_{\mathbb{P}_Y(E)/Y}\otimes L)^{\vee}$ induces a proper morphism $p:\mathbb{P}_X(i^*(\Omega_{\mathbb{P}_Y(E)/Y}\otimes L)^{\vee}) \to \mathbb{P}_Y(E^{\vee})$ over $Y$. Write $$U := \mathbb{P}_Y(E^{\vee})\setminus \mathrm{Im}(p).$$ Then $j:U\hookrightarrow \mathbb{P}_Y(E^{\vee})$ is open, hence $g:U\to Y$ is flat and quasi-compact. Moreover, since $X$ is quasi-finite over $Y$, by considering that after base-change to $\mathrm{Spec}(k(y))$, we conclude that $U$ is surjective. Thus $U$ is fpqc over $Y$.

Write $\varphi: g^*E^{\vee}\to j^*\mathcal{O}_{\mathbb{P}_Y(E^{\vee})/Y}(1)$ for the surjection induced by $j$. Then the surjection $\varphi^{\vee}: g^*E\to \ker(\varphi)^{\vee}$ induces a closed immersion $$\mathbb{P}_U(\ker(\varphi)^{\vee})\hookrightarrow \mathbb{P}_U(g^*E) \cong \mathbb{P}_Y(E) \times_Y U.$$ Write $V\subset (\mathbb{P}_Y(E) \times_Y U) \setminus \mathbb{P}_U(\ker(\varphi)^{\vee})$. Then $V$ is affine over $U$. Moreover, by our construction, $i\times_Y \mathrm{id}_U:X\times_YU \to \mathbb{P}_Y(E)\times_Y U$ factors uniquely through $V$. Since $i\times_Y \mathrm{id}_U$ is a closed immersion, this implies that $X\times_Y U$ is affine. Since $U$ is fpqc over $Y$, $X$ is also affine over $Y$. $\square$