I apologize if the title to this post is not too clear. I was reading Jech's Set Theory and I came across Theorem 2.8: "If $W_1$ and $W_2$ are well-ordered sets, then exactly one of the following three holds: (i) $W_1$ is isomorphic to $W_2$; (ii) $W_1$ is isomorphic to an initial segment of $W_2$; (iii) $W_2$ is isomorphic to an initial segment of $W_1$."
I tried to produce a proof on my own and I came across the following argument, which I know to be sketchy at best but which I do wonder if can be in any way formalized. My idea was basically: Consider $W_1$ and grab its least element, $w_0$, our isomorphism, $f$, shall then contain the pair $(w_0, w'_0)$, with $w'_0$ the least element in $W_2$. Now, consider $W_1 \setminus w_0$, and grab its least element, $w_1$. $f$ shall now also contain $(w_1, w'_1)$, with $w'_1$ the least element in $W_2 \setminus w'_0$.
This process of "pairing" could continue indefinitely, adding pair after pair to our function $f$ such that it is still order-preserving. I would say that, as each element of $w \in W_1$ is the least element of $W_1 \setminus \{x \in W_1 : x < w\}$, each pair will eventually be considered, but this is clearly false as such an intuition seems only to hold for sets of the "countable" type where one may reasonably expect to "eventually" reach the desired $w$.
Thus, there is the crux of my issue. Such an argument could only possibly hold for countable $W_1$, right? The moment $W_1$ is uncountable, one stops being able to "iteratively" produce such constructions (transfinitely?), and similarly stops being able to produce such arguments of the sort "I construct $S_0$ first, then $S_1$, then... $\longrightarrow$ my solution is $S = \bigcup S_n$."
Is this sort of uncountably infinite construction strategy ever viable? Can it perhaps be formalized via the Axiom of Choice? Perhaps one could justify the existence of the set of pairs $(w_i, w'_i)$ with the above described property (each is, in some way, the $i$th least element of its respective set, all other $i - 1$ least elements removed) through the existence of a choice function acting over the subset ($\in \mathcal{P}(W_1 \times W_2)$) containing exactly such pairs?
I know these questions are probably really dumb or misguided at best, but I am just starting on the topic and honestly I still am yet to fully wrap my head around axioms like Choice or Replacement and when (and when not) one is able to perform these sort of "infinitary" arguments.
Many thanks in advance for any and all answers!
Your intuition for how this works is correct here.
We match up the least element $w_0\in W_1$ to the least element $v_0\in W_2,$ then match up the least element $w_1\in W_1-\{w_0\}$ to the least element $v_1\in W_2-\{v_0\}$, "and so on". Eventually, we'll run out of elements of one or the other sets and we'll have an initial segment embedding of one into the other (or an isomorphism if they both run out at the same stage).
We needn't stop after countably many stages: we can just match up the least element of $W_1-\{w_0,w_1,\ldots\}$ to the least element of $W_2-\{v_0,v_1,\ldots\}$ and keep going.
The problem is in making this idea completely rigorous. This is most naturally viewed as construction by transfinite recursion on the ordinals. For example, we'd want to call the pair in the previous paragraph $w_\omega,v_\omega,$ and the next one $w_{\omega+1}, v_{\omega+1}.$
But Jech doesn't cover transfinite recursion for a few more pages. And this rigidity theorem is fundamental, in that it is used at several points before then. So this necessitates the somewhat more obscure proof that Jech gives. Nonetheless, if you read the proof carefully, you'll see that it is essentially the same idea, only taking care to avoid the informal hand-wavy argument that will later be justifed through transfinite recursion.
On a side note, Jech's book probably isn't the best place to learn the basics, since it can be a bit brief. I'd suggest either Hrbacek/Jech or Devlin's "The Joy of Sets".