I'm trying to follow the proof that is given in the "Naive Set Theory" by Halmos that each well ordered set is similar to a unique ordinal number (which Halmos calls the Counting Theorem):
Counting Theorem. Each well ordered set is similar to a unique ordinal number.
Since for ordinal numbers similarity is the same as equality, uniqueness is obvious. Suppose now that $X$ is a well ordered set and suppose that an element $a$ of $X$ is such that the initial segment determined by each predecessor of $a$ is similar to some (necessarily unique) ordinal number. If $S(x, \alpha)$ is the sentence that says "$\alpha$ is an ordinal number and $s(x)$ is similar to $\alpha$", then, for each $x$ in $s(a)$, the set $\{\alpha: S(x, \alpha)\}$ can be formed; in fact, that set is a singleton. The axiom of substitution implies the existence of a set consisting exactly of the ordinal numbers similar to the initial segments determined by the predecessors of $a$. It follows, whether $a$ is the immediate successor of one of its predecessors or the supremum of them all, that $s(a)$ is similar to an ordinal number. This argument prepares the way for an application of the prinicple of transfinite induction; the conclusion is that each initial segment in $X$ is similar to some ordinal number. This fact, in turn, justifies another application of the axiom of substitution, just like the one made above; the final conclusion is, as desired, that $X$ is similar to some ordinal number.
I cannot understand what is the justification of the step: "It follows, whether $a$ is the immediate successor of one of its predecessors or the supremum of them all, that $s(a)$ is similar to an ordinal number."
I understand the construction of the set $\{\alpha: S(x, \alpha)\}$, but I only see that each element of that set is an ordinal number which is similar to some of the initial segments of $s(a)$. I also see that they are ordered by continuation (but so is any set of ordered numbers). From the step that puzzles me, I expect that the said set has to be an ordinal number, or at least to be similar to an ordinal number, but I can't see how this can be shown (or how this can be obvious).
PS. As far as I know, some of the terms that Halmos uses have modern equivalents: similar = order-isomorphic Axiom of Substitution = Axiom of Replacement
Halmos also has proven a fact hat every set of ordinal numbers has a supremum in the text just before that. This might be helpful, as $\{\alpha: S(x, \alpha)\}$ is definitely a set of ordinal numbers. So it must have a supremum, but what is it to do next, I don't understand.
UPDATE
After asking the question I think I pinpointed the part which is difficult to me.
Halmos asserts the fact that $s(a)$ is similar to an ordered number whether $a$ is the immediate successor of one of its predecessors, or the supremum of them.
In case $a$ is the immediate successor, it's easy to see that $s(a)$ is similar to some ordered number. There should be some $b \in s(a)$ such that $a$ is the immediate successor of $b$. Then $s(b)$ is similar to some ordinal number, say $\beta$ and $s(a)$ is similar to $\beta^+$. But I don't understand how to show the same if $a$ is the supremum of $s(a)$.
I prefer the following development, which is done in ZF minus the axioms of Infinity and Foundation (a.k.a. Regularity):
(1). Well-Orders. For a well-order $<$ on a set $A$ we define an initial segment of $A$ as $\{b\in A: b<a\}$ for some (any) $a\in A.$ And write $pred_< a=\{b\in A: b<a\}.$ ("pred" for predecessors).
(1.1). The only isomorphism of a well-order to itself is the identity.
(1.2). There is at most one isomorphism from one well-order to another.
(1.3) A well-order is not isomorphic to an initial segment of itself.
(1.4). Distinct initial segments are not isomorphic to each other.
(1.5). The Main Theorem. Trichotomy.
For any well-orders $A, B$ exactly one of the following is true:
(i). $A$ is isomorphic to $B.$
(ii). $A$ is isomorphic to an initial segment of $B.$
(iii). $B$ is isomorphic to an initial segment of $A.$
Proof of (1.5): Let $<_A, <_B$ be the well-ordering relations on $A, B$ respectively. Let $A^*$ be the set of those $a\in A$ for which there is an isomorphism $f_a:pred_{<_A}\to pred_{<_B}b$ for some $b\in B.$
(i'). By (1.3), $b$ is unique, and by that and (1.2), $f_a$ is unique. So let $g(a)=b$ (By Replacement applied to $A^*$). So for $a\in A^*,$ we have $f_a:pred_{<_A}\to pred_{<_B}g(a).$
(ii'). Let $a\in A^*$ and $a'<_A a.$ The restriction of $f_a$ to $pred_{<_A}a'$ is an isomorphism to an initial segment $pred_{<_B}b'$ of $B,$ so $a'\in A^*.$ Also, by (i') and (1.3) and (1.2) we have $b'=g(a')$ and the restriction of $f_a$ to $pred_{<_A}a'$ is $f_{a'}.$ It should be clear that $g(a')<_B g(a).$
Furthermore, if $b''<_B g(a)$ then because $f_a$ is an isomorphism, the set $(f_a)^{-1}pred_{<_B}b''$ is equal to $pred_{<_A}a''$ for some $a''\le_A a,$ so $b''=g(a'').$
So $A^*$ is either $A$ or an initial segment of $A,$ and the set $G=\{g(a):a\in A^*\}$ is either $B$ or is an initial segment of $B,$ and $g:A^*\to G$ is an isomorphism.
(iii'). (Case One). If $A^*=A$ then (i) or (ii) of the Theorem is true,but not both, by (1.3). And (iii) cannot be true because if $h$ was an isomorphism of $B$ to an initial segment of $A$ then $(h|_G)\circ g$ would be an isomorphism from $A$ to an initial segment of $A,$ contrary to (1.3).
(iv'). (Case Two). If $A^*=pred_{<_A}\alpha$ then $G=B.$ Otherwise, if $G=pred_{<_B}\beta,$ then $g:pred_{<_A}\alpha\to pred_{<_B}\beta$ is an isomorphism. But then the definition of $A^*$ implies $\alpha \in A^*,$ which is absurd because $A^*=pred_{<_A}\alpha.$
So if $A^*$ is an initial segment of $A$ then (iii) of the Theorem is true, and by applications of (1.3), (i) and (ii) are false.
(2). Ordinals. Let $<_A$ be a well-order on $A .$ To prove $A$ is isomorphic to an ordinal:
Case I: If there exists ordinal $B$ such that (i) holds, we are done.
Case II: If there exits ordinal $B$ such that (ii) holds, then since an initial segment of an ordinal is also an ordinal, we are done.
Case III. If cases I and II never hold then (iii) holds for every $B\in On$ (every ordinal $B$). Now for any $a\in A$ the set $pred_{<_A}a$ is isomorphic to at most one $B\in On$ by (1.3) because if $B,B'$ are distinct ordinals then one of them is an initial segmnt of the other. So for $a\in A$ let $h(a)=B$ if $pred_{<_A}a$ is isomorphic to $B\in On,$ and $h(a)=\emptyset$ otherwise. By Replacement the set $\{h(a):a\in A\}=On$ exists, but this is impossible. So Case I or Case II must apply and we are done.
(3).Remarks.
(3a). Lemmas (1.1),(1.2), (1.3) and (1.4) are easy to prove by contradiction. E.g. for (1.1) suppose $f:A\to A$ is an isomorphism and $a_0$ is the least $a\in A$ such that $f(a)\ne a.$ And we use (1.1) to prove (1.2). For (1.3) suppose $f:A\to pred_<a$ is an isomorphism . Then $f(a)<a,$ so consider the least $\alpha \in A$ such that $f(\alpha)\ne \alpha$.....Let me know if you need help with them.
(3b). Remember that in the language of Set Theory in use here, $B\in On$ does not assert there exists a set $On$. It is an abbreviation for "$\forall b\in B\,(b\subset B),$ and $B$ is well-ordered by $\in$".