I'm reading the proof of Prop. A.3 in Hatcher's Algebraic Topology, which states that CW complexes are normal. Here's the statement and Hatcher's proof:
(The only detail of the inductive process mentioned in the proof relevant to my question is that $N_\epsilon^n(A)$ is a neighborhood of $A \cap X^n$ in $X^n$ and similarly for $B$.) I'm having some trouble with the sentence beginning "For a characteristic map..." starting on line 4 of the proof. Here are my questions:
What's the definition of the distance between two sets? I'm only familiar with the distance between a point and a set, and I'm assuming that the distance between subsets $A,B \subset X$, with $X$ a metric space, is just $\inf_{a\in A, b \in B} d(a,b)$.
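Written out, the definition I'm assuming (my own guess, not quoted from Hatcher) for nonempty subsets $A, B$ of a metric space $(X,d)$ is:

$$
d(A,B) \;=\; \inf_{a \in A,\ b \in B} d(a,b) \;=\; \inf_{b \in B} d(b,A),
\qquad \text{where } d(b,A) = \inf_{a \in A} d(a,b).
$$

The second equality just rewrites the double infimum in terms of the point-to-set distance I already know.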
Why is it necessary to talk about convergent sequences? What's wrong with this argument: If the distance between $\Phi_\alpha^{-1}(N_\epsilon^n(A))$ and $\Phi_\alpha^{-1}(B)$ is $0$, then since the latter set is compact we can find a point $x \in \Phi_\alpha^{-1}(B)$ such that $d(x, \Phi_\alpha^{-1}(N_\epsilon^n(A))) = 0$. But $x$ has to be in $\partial D^{n+1}$, because all points in $\operatorname{Int} D^{n+1}$ are a positive distance from $\partial D^{n+1}$ and thus from $\Phi_\alpha^{-1}(N_\epsilon^n(A))$. But this is impossible, because $x$ has an open neighborhood in $\partial D^{n+1}$ that is disjoint from $\Phi_\alpha^{-1}(N_\epsilon^n(A))$ (as stated in the proof). I just don't see why we need a convergent sequence when the important information is the point $x$ that is itself at distance zero, and we don't even need a convergent sequence to find $x$.
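Spelled out, the compactness step I'm relying on is the following. The names $K$ and $S$ are just my shorthand, and I'm assuming the standard fact that $x \mapsto d(x,S)$ is continuous, so it attains its minimum on a nonempty compact set:

$$
K = \Phi_\alpha^{-1}(B), \quad S = \Phi_\alpha^{-1}(N_\epsilon^n(A)), \qquad
d(K,S) \;=\; \inf_{x \in K} d(x,S) \;=\; \min_{x \in K} d(x,S),
$$

$$
\text{so}\quad d(K,S) = 0 \;\Longrightarrow\; \text{there is } x \in K \text{ with } d(x,S) = 0 .
$$

If this is right, the point $x$ is found directly, with no sequence chosen along the way.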
