Ok, buckle up for a rather long question. I've spent a large portion of today learning about compactness, stemming mainly from this Wikipedia article about point-set topology. The article mentions three main things: continuity, connectedness, and compactness. I will address each in turn, but my question is mainly about the last one.
Continuity: At least having gone through high school calculus and basic university analysis, I think people have a great intuitive understanding of continuity (and also differentiability, I guess): smooth = yay!, jagged = fine-ish, holes/jumps = really bad :(. The wiki article describes this as "taking nearby points to nearby points", which I understand well enough that I think, given a decade or something, I could eventually have come up with the formal epsilon-delta definition of continuity: $$\forall \varepsilon >0, \exists \delta>0, \text{ s.t. } |x-x_{0}|<\delta \Rightarrow |f(x)-f(x_{0})|<\varepsilon $$
Connectedness: Similarly, I think people have a great intuition for connectedness (at least path connectedness), which the wiki article summarizes nicely as "sets that cannot be divided into two pieces that are far apart". Again, I think that given a decade or two I could have gone in the right direction at least for the formal definition of connectedness: a set that can't be represented as the union of two or more disjoint non-empty open subsets.
First (Minor) Question: could we have a useful definition for connectedness being: "a set that can't be represented as the union of two or more disjoint non-empty closed subsets"?
Similarly, I feel like I could reasonably develop definitions for open sets (based on intuitions from number lines and basic high-school algebra/set theory) and completeness (basically the Least Upper Bound axiom/Cauchy sequences). However, there's one thing missing: compactness. Never in a million years do I think I could have come up with the definition that a compact set is one that "can be covered by finitely many sets of arbitrarily small size." I've looked at these five sites and some links therein:
- Why is compactness so important?
- What should be the intuition when working with compactness?
- https://www.math.ucla.edu/~tao/preprints/compactness.pdf
- https://www.reddit.com/r/math/comments/47h6hg/what_really_is_a_compact_set/
- https://arxiv.org/abs/1006.4131
but none of them so far has really clicked with me. Many people emphasized that it's a generalized version of finiteness with "fat blurry points", and I also understand that by the Heine-Borel Theorem compactness is equivalent to "closed and bounded" in Euclidean space, but those two things seem so far apart that it just seems like a black-magic coincidence that they describe the same phenomenon.
How would you motivate and explain the definition and concept of compactness to your students in such a way that they feel like they could have come up with it themselves, naturally, given a decade or two?
If you start with "it's a generalized version of finiteness" it seems like a complete and utter coincidence that it happens to be equivalent to "closed and bounded". I mean of all the possible "generalized finiteness formulations", how did ours get it right?
If you start with "it's just another way of saying closed and bounded", then students will feel that it's just more arbitrary confusion redefining things they already know (namely that of closed-ness and boundedness); furthermore, even if they did accept this explanation, they would have never figured out on their own that "every open cover has a finite subcover $\iff$ closed and bounded". "Finite subcover" just seems so out-of-left-field.
And finally, if you go the sequential compactness route (referring to Tao's paper here), students will just say "ahh yes, the Bolzano-Weierstrass Theorem! Why does it need the new name 'compactness'?"
Maybe I've missed something in my searches, but I hope this question isn't just a badly rehashed old question. I don't think my question is answered in the "Pedagogical History of Compactness", mainly because I don't want the convoluted history, but rather the simplest motivation and explanation based on modern curriculum and notation.
Edit: Also, thank you to those who've commented/left an answer. I hope that this page and all the differing pedagogical interpretations presented will serve as a relatively complete and comprehensive guide for beginners learning compactness in the future. Please upvote answers you think are particularly insightful; as a novice, I would appreciate some expert judgement on the explanatory power of these answers.
I'm going to make a stab at "compactness" here. Suppose you want to prove something about sets in, say, a metric space. You'd like to, say, define the "distance" between a pair of sets $A$ and $B$. You've thought about this question for, say, finite sets of real numbers, and things worked out OK, and you're hoping to generalize. So you say something like "I'll just take all points in $A$ and all points in $B$ and look at $d(a, b)$ for each of those, and then take the min."
But then you realize that "min" might be a problem, because the set of $(a,b)$-pairs might be infinite -- even uncountably infinite -- and "min" is only guaranteed to exist for finite sets.
But you've encountered this before, and you say "Oh...I'll just replace this with "inf" the way I'm used to!" That's a good choice. But now something awkward happens: you find yourself with a pair of sets $A$ and $B$ whose distance is zero, but which share no points. You'd figured that, in analogy with the finite-subsets-of-$\Bbb R$ case, distance-zero would mean "some point is in both sets", but that's just not true.
Then you think a bit, and realize that if $A$ is the set of all negative reals, and $B$ is the set of all positive reals, the "distance" between them is zero (according to your definition), but ... there's no overlap. This isn't some weird metric-space thing ... it's happening even in $\Bbb R$. And you can SEE what the problem is --- it's the "almost getting to zero" problem, because $A$ and $B$ are open.
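Spelled out with the inf-of-all-pairwise-distances definition above, the points $-\tfrac{1}{n} \in A$ and $\tfrac{1}{n} \in B$ already force
$$d(A,B) \;=\; \inf\{\, |a-b| : a \in A,\ b \in B \,\} \;\le\; \left|\left(-\tfrac{1}{n}\right) - \tfrac{1}{n}\right| \;=\; \tfrac{2}{n} \;\longrightarrow\; 0,$$
so the "distance" is zero even though $A \cap B = \emptyset$.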
So you back up and say "Look, I'm gonna define this notion only for closed sets; that'll fix this stupid problem once and for all!"
And then someone says "Let $A$ be the $x$-axis in $\Bbb R^2$ and let $B$ be the graph of $y = e^{-x}$." And you realize that these are both closed sets, and they don't intersect, but the distance you've defined is still zero. Damnit!
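The trouble is that for every $x$ the graph point $(x, e^{-x}) \in B$ sits directly above the axis point $(x, 0) \in A$, so
$$d(A,B) \;\le\; \inf_{x \in \Bbb R} d\big((x,0),\,(x,e^{-x})\big) \;=\; \inf_{x \in \Bbb R} e^{-x} \;=\; 0,$$
even though $e^{-x} > 0$ for every single $x$ -- the graph never actually touches the axis.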
You look more closely, and you realize the problem is with $\{ d(a, b) \mid a \in A, b \in B\}$. That set is an infinite set of positive numbers, but the inf still manages to be zero. If it were a finite set, the inf (or the min -- same thing in that case!) would be positive, and everything would work out the way it was supposed to.
Still looking at $A$ and $B$, instead of looking at all the points in $A$ and $B$, you could say "Look, if $B$ is at distance $q$ from $A$, then around any point of $B$, I should be able to place an (open) ball of radius $q$ without hitting $A$. How 'bout I rethink things, and say this instead: consider, for all points $b \in B$, the largest $r$ such that $B_r(b) \cap A = \emptyset$ ... and then I'll just take the smallest of these 'radii' as the distance."
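Just to pin that idea down in symbols (only a sketch of what I mean, not standard notation): for each $b \in B$ set
$$r(b) \;:=\; \max\{\, r \ge 0 : B_r(b) \cap A = \emptyset \,\} \;=\; d(b, A), \qquad \text{and try} \qquad \tilde{d}(A,B) \;:=\; \inf_{b \in B} r(b).$$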
Of course, that still doesn't work: the set of radii, being infinite, might still have zero as its inf. But what if you could somehow pick just finitely many of them? Then you could take a min and get a positive number.
Now, that exact approach doesn't really work, but something pretty close does work, and situations just like that keep coming up: you've got an infinite collection of open balls, and want to take the minimum radius, but "min" has to be "inf" and it might be zero. At some point, you say "Oh, hell. This proof isn't working, and something like that graph-and-$x$-axis problem keeps messing me up. How 'bout I just restate the claim and say that I'm only doing this for sets where my infinite collection of open sets can always be reduced to a finite collection?"
Your skeptical colleague from across the hall comes by and you explain your idea, and the colleague says "You're restricting your theorem to these 'special' sets, ones where every covering by open sets has a finite subcover ... that seems like a pretty extreme restriction. Are there actually any sets with that property?"
And you go off and work for a while and convince yourself that the unit interval has that property. And then you realize that in fact if $X$ is special and $f$ is continuous, then $f(X)$ is also special, so suddenly you've got tons of examples, and you can tell your colleague that you're not just messing around with the empty set. But the colleague then asks, "Well, OK. So there are lots of these. But this finite-subcover stuff seems pretty...weird. Is there some equivalent characterization of these special sets?"
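(A quick aside on why $f(X)$ inherits the property, since it's really a one-line argument: if $\{U_i\}$ is an open cover of $f(X)$, then continuity makes each $f^{-1}(U_i)$ open, and together they cover $X$; a finite subcover of $X$ then gives
$$X \subseteq f^{-1}(U_{i_1}) \cup \cdots \cup f^{-1}(U_{i_n}) \quad\Longrightarrow\quad f(X) \subseteq U_{i_1} \cup \cdots \cup U_{i_n},$$
which is a finite subcover of $f(X)$.)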
It turns out that there's not -- the "change infinite into finite" really is the secret sauce. But in some cases -- like for subsets of $\Bbb R^n$ -- there is an equivalent characterization, namely "closed and bounded". Well, that's something everyone can understand, and it's a pretty reasonable kind of set, so you need a word. Is "compact" the word I'd have chosen? Probably not. But it certainly matches up with the "bounded"-ness, and it's not such a bad word, so it sticks.
The key thing here is that the idea of compactness arises because of multiple instances of people trying to do stuff and finding it'd all work out better if they could just replace a cover by a finite cover, often so that they can take a "min" of some kind. And once something gets used enough, it gets a name.
[Of course, my "history" here is all fiction, but there are plenty of cases of this sort of thing getting named. Phrases like "in general position", for instance, arise to keep us out of the weeds of endless special cases that are arbitrarily near to perfectly nice cases.]
Sorry for the long and rambling discourse, but I wanted to make the case that stumbling on the notion of compactness (or "linear transformation", or "group") isn't that implausible.
One of the big problems I had when first learning math was that I thought all this stuff was handed down to Moses on stone tablets, and didn't realize that it arose far more organically. Perhaps one of the tip-offs was when I learned about topological spaces, and one of the classes of spaces was "T-2 1/2". It seemed pretty clear that someone skipped over something and then went back and filled in a spot that wasn't there by giving a "half-number" as a name. (This could well be wrong, but it's sure how it looked to a beginner!)