Ok, buckle up for a rather long question. I've spent a large portion of today learning about compactness, stemming mainly from this Wikipedia article about point-set topology. The article mentions three main things: continuity, connectedness, and compactness. I will address each in turn, but my question is mainly about the last one.
Continuity: At least having gone through high school calculus and basic university analysis, I think people have a great intuitive understanding of continuity (and also differentiability, I guess): smooth = yay!, jagged = fine-ish, holes/jumps = really bad :(. The wiki article describes this as "taking nearby points to nearby points", which I understand well enough that I think, given a decade or something, I could eventually have come up with the formal epsilon-delta definition of continuity: $$\forall \varepsilon >0, \exists \delta>0, \text{ s.t. } |x-x_{0}|<\delta \Rightarrow |f(x)-f(x_{0})|<\varepsilon $$
Connectedness: Similarly, I think people have a great intuition for connectedness (at least path connectedness), which the wiki article summarizes nicely as "sets that cannot be divided into two pieces that are far apart". Again, I think that given a decade or two I could have gone in the right direction at least for the formal definition of connectedness: a set that can't be represented as the union of two or more disjoint non-empty open subsets.
First (Minor) Question: could we have a useful definition for connectedness being: "a set that can't be represented as the union of two or more disjoint non-empty closed subsets"?
Similarly, I feel like I could reasonably develop definitions for open sets (based on intuitions from number lines and basic high-school algebra/set theory) and completeness (basically the Least Upper Bound axiom/Cauchy sequences). However, there's one thing missing: compactness. Never in a million years do I think I could have come up with the definition that a compact set is one that "can be covered by finitely many sets of arbitrarily small size." I've looked at these five sites and some links therein:
- Why is compactness so important?
- What should be the intuition when working with compactness?
- https://www.math.ucla.edu/~tao/preprints/compactness.pdf
- https://www.reddit.com/r/math/comments/47h6hg/what_really_is_a_compact_set/
- https://arxiv.org/abs/1006.4131
but none of them so far has really clicked with me. Many people emphasized that it's a generalized version of finiteness with "fat blurry points", and I also understand that by the Heine-Borel Theorem compactness is equivalent to "closed and bounded" in Euclidean space, but those two things seem so far apart that it just seems like a black-magic coincidence that they describe the same phenomenon.
How would you motivate and explain the definition and concept of compactness to your students in such a way that they feel like they could have come up with it themselves, naturally, given a decade or two?
If you start with "it's a generalized version of finiteness" it seems like a complete and utter coincidence that it happens to be equivalent to "closed and bounded". I mean of all the possible "generalized finiteness formulations", how did ours get it right?
If you start with "it's just another way of saying closed and bounded", then students will feel that it's just more arbitrary confusion redefining things they already know (namely that of closed-ness and boundedness); furthermore, even if they did accept this explanation, they would have never figured out on their own that "every open cover has a finite subcover $\iff$ closed and bounded". "Finite subcover" just seems so out-of-left-field.
And finally, if you go the sequential compactness route (referring to Tao's paper here), students will just say "ahh yes, the Bolzano-Weierstrass Theorem! Why does it need the new name 'compactness'?"
Maybe I've missed something in my searches, but I hope this question isn't just a badly rehashed old question. I don't think my question is answered in the "Pedagogical History of Compactness", mainly because I don't want the convoluted history, but rather the simplest motivation and explanation based on modern curriculum and notation.
Edit: Also, thank you to those who've commented/left an answer. I hope that this page and all the differing pedagogical interpretations presented will serve as a relatively complete and comprehensive guide for beginners learning compactness in the future. Please upvote answers you think are particularly insightful; as a novice, I would appreciate some expert judgement on the explanatory power of these answers.
I'm going to make a stab at "compactness" here. Suppose you want to prove something about sets in, say, a metric space. You'd like to, say, define the "distance" between a pair of sets $A$ and $B$. You've thought about this question for, say, finite sets of real numbers, and things worked out OK, and you're hoping to generalize. So you say something like "I'll just take all points in $A$ and all points in $B$ and look at $d(a, b)$ for each of those, and then take the min."
But then you realize that "min" might be a problem, because the set of $(a,b)$-pairs might be infinite -- even uncountably infinite -- and "min" is only guaranteed to exist for finite sets.
But you've encountered this before, and you say "Oh...I'll just replace this with "inf" the way I'm used to!" That's a good choice. But now something awkward happens: you find yourself with a pair of sets $A$ and $B$ whose distance is zero, but which share no points. You'd figured that, in analogy with the finite-subsets-of-$\Bbb R$ case, distance-zero would mean "some point is in both sets", but that's just not true.
Then you think a bit, and realize that if $A$ is the set of all negative reals, and $B$ is the set of all positive reals, the "distance" between them is zero (according to your definition), but ... there's no overlap. This isn't some weird metric-space thing ... it's happening even in $\Bbb R$. And you can SEE what the problem is --- it's the "almost getting to zero" problem, because $A$ and $B$ are open.
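Spelled out with the inf-of-all-pairwise-distances definition above, the points $-\tfrac{1}{n} \in A$ and $\tfrac{1}{n} \in B$ already force
$$d(A,B) \;=\; \inf\{\, |a-b| : a \in A,\ b \in B \,\} \;\le\; \left|\left(-\tfrac{1}{n}\right) - \tfrac{1}{n}\right| \;=\; \tfrac{2}{n} \;\longrightarrow\; 0,$$
so the "distance" is zero even though $A \cap B = \emptyset$.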
So you back up and say "Look, I'm gonna define this notion only for closed sets; that'll fix this stupid problem once and for all!"
And then someone says "Let $A$ be the $x$-axis in $\Bbb R^2$ and let $B$ be the graph of $y = e^{-x}$." And you realize that these are both closed sets, and they don't intersect, but the distance you've defined is still zero. Damnit!
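The trouble is that for every $x$ the graph point $(x, e^{-x}) \in B$ sits directly above the axis point $(x, 0) \in A$, so
$$d(A,B) \;\le\; \inf_{x \in \Bbb R} d\big((x,0),\,(x,e^{-x})\big) \;=\; \inf_{x \in \Bbb R} e^{-x} \;=\; 0,$$
even though $e^{-x} > 0$ for every single $x$ -- the graph never actually touches the axis.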
You look more closely, and you realize the problem is with $\{ d(a, b) \mid a \in A, b \in B\}$. That set is an infinite set of positive numbers, but the inf still manages to be zero. If it were a finite set, the inf (or the min -- same thing in that case!) would be positive, and everything would work out the way it was supposed to.
Still looking at $A$ and $B$, instead of looking at all the points in $A$ and $B$, you could say "Look, if $B$ is at distance $q$ from $A$, then around any point of $B$, I should be able to place an (open) ball of radius $q$ without hitting $A$. How 'bout I rethink things, and say this instead: consider, for all points $b \in B$, the largest $r$ such that $B_r(b) \cap A = \emptyset$ ... and then I'll just take the smallest of these 'radii' as the distance."
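Just to pin that idea down in symbols (only a sketch of what I mean, not standard notation): for each $b \in B$ set
$$r(b) \;:=\; \max\{\, r \ge 0 : B_r(b) \cap A = \emptyset \,\} \;=\; d(b, A), \qquad \text{and try} \qquad \tilde{d}(A,B) \;:=\; \inf_{b \in B} r(b).$$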
Of course, that still doesn't work: the set of radii, being infinite, might still have zero as its inf. But what if you could somehow pick just finitely many of them? Then you could take a min and get a positive number.
Now, that exact approach doesn't really work, but something pretty close does work, and situations just like that keep coming up: you've got an infinite collection of open balls, and want to take the minimum radius, but "min" has to be "inf" and it might be zero. At some point, you say "Oh, hell. This proof isn't working, and something like that graph-and-$x$-axis problem keeps messing me up. How 'bout I just restate the claim and say that I'm only doing this for sets where my infinite collection of open sets can always be reduced to a finite collection?"
Your skeptical colleague from across the hall comes by and you explain your idea, and the colleague says "You're restricting your theorem to these 'special' sets, ones where every covering by open sets has a finite subcover ... that seems like a pretty extreme restriction. Are there actually any sets with that property?"
And you go off and work for a while and convince yourself that the unit interval has that property. And then you realize that in fact if $X$ is special and $f$ is continuous, then $f(X)$ is also special, so suddenly you've got tons of examples, and you can tell your colleague that you're not just messing around with the empty set. But the colleague then asks, "Well, OK. So there are lots of these. But this finite-subcover stuff seems pretty...weird. Is there some equivalent characterization of these special sets?"
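(A quick aside on why $f(X)$ inherits the property, since it's really a one-line argument: if $\{U_i\}$ is an open cover of $f(X)$, then continuity makes each $f^{-1}(U_i)$ open, and together they cover $X$; a finite subcover of $X$ then gives
$$X \subseteq f^{-1}(U_{i_1}) \cup \cdots \cup f^{-1}(U_{i_n}) \quad\Longrightarrow\quad f(X) \subseteq U_{i_1} \cup \cdots \cup U_{i_n},$$
which is a finite subcover of $f(X)$.)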
It turns out that there's not -- the "change infinite into finite" really is the secret sauce. But in some cases -- like for subsets of $\Bbb R^n$ -- there is an equivalent characterization, namely "closed and bounded". Well, that's something everyone can understand, and it's a pretty reasonable kind of set, so you need a word. Is "compact" the word I'd have chosen? Probably not. But it certainly matches up with the "bounded"-ness, and it's not such a bad word, so it sticks.
The key thing here is that the idea of compactness arises because of multiple instances of people trying to do stuff and finding it'd all work out better if they could just replace a cover by a finite cover, often so that they can take a "min" of some kind. And once something gets used enough, it gets a name.
[Of course, my "history" here is all fiction, but there are plenty of cases of this sort of thing getting named. Phrases like "in general position", for instance, arise to keep us out of the weeds of endless special cases that are arbitrarily near to perfectly nice cases.]
Sorry for the long and rambling discourse, but I wanted to make the case that stumbling on the notion of compactness (or "linear transformation", or "group") isn't that implausible.
One of the big problems I had when first learning math was that I thought all this stuff was handed down to Moses on stone tablets, and didn't realize that it arose far more organically. Perhaps one of the tip-offs was when I learned about topological spaces, and one of the classes of spaces was "T-2 1/2". It seemed pretty clear that someone skipped over something and then went back and filled in a spot that wasn't there by giving a "half-number" as a name. (This could well be wrong, but it's sure how it looked to a beginner!)