Let $A$ be an arbitrary set. We will consider pairs $(B, G)$, where $B$ is a subset of $A$, and $G$ is an order relation in $B$ which well-orders $B$.
Let $\mathscr{A}$ be the family of all such pairs $(B, G)$. We introduce the symbol $\prec$ and define $(B, G)\prec (B',G')$ if and only if:
- $B\subseteq B'.$
- $G\subseteq G'.$
- $x\in B$ and $y\in B'-B\Rightarrow (x,y)\in G'$.
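To get a concrete feel for the three conditions, here is a minimal Python sketch on finite sets. It assumes an order relation is stored as a set of pairs `(x, y)` meaning "$x$ is below or equal to $y$"; the names `precedes` and `chain_order` are made up for this illustration and do not come from the text.

```python
def precedes(B, G, B2, G2):
    """Return True when (B, G) ≺ (B2, G2) under the three conditions above."""
    return (
        B <= B2                                              # B is a subset of B'
        and G <= G2                                          # G is a subset of G'
        and all((x, y) in G2 for x in B for y in B2 - B)     # new elements sit on top
    )


def chain_order(elements):
    """Hypothetical helper: the reflexive total order that lists
    `elements` from least to greatest, as a set of pairs."""
    n = len(elements)
    return {(elements[i], elements[j]) for i in range(n) for j in range(i, n)}


B1, G1 = {1, 2}, chain_order([1, 2])
B2, G2 = {1, 2, 3}, chain_order([1, 2, 3])   # 3 is added on top of 1 and 2
B3, G3 = {1, 2, 3}, chain_order([3, 1, 2])   # 3 is added below 1 and 2

print(precedes(B1, G1, B2, G2))  # 3 lies above all of B1
print(precedes(B1, G1, B3, G3))  # fails only the third condition
```

In the second comparison the first two conditions hold, so it is the third condition alone that forces the larger order to extend the smaller one "at the top".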
Lemma 1.0: Let $$\mathscr{C}=\{(B_i,G_i)\}_{i\in I}$$ be a chain of $\mathscr{A}$, and let $$B=\bigcup_{i\in I}B_i\quad\text{and}\quad G=\bigcup_{i\in I}G_i.$$ Then $(B,G)\in \mathscr{A}.$
Lemma 1.1: If $\mathscr{C},B,G$ are defined as above, $(B, G)$ is an upper bound of $\mathscr{C}$.
Proof: Let $(B_i,G_i) \in \mathscr{C}$; clearly $B_i\subseteq B$ and $G_i\subseteq G$. Now suppose that $x\in B_i$, $y \in B$, and $y\notin B_i$; certainly $y\in B_j$ for some $j\in I$. So $(B_j, G_j) \nprec(B_i,G_i)$, hence $(B_i ,G_i)\prec (B_j, G_j)$. Now $x \in B_i$ and $y \in B_j - B_i$, so $(x, y) \in G_j \subseteq G$. Thus $(B_i,G_i)\prec (B, G)$.
Commentary: As I understand it, this proof shows that any element $(B_i, G_i)$ of $\mathscr{C}$ can be compared with $(B, G)$ under the relation defined above, and in fact satisfies $(B_i,G_i)\prec(B,G)$. Am I correct? The other thing is the step that says, "So $(B_j, G_j) \nprec(B_i,G_i)$, hence $(B_i ,G_i)\prec (B_j, G_j)$"; I don't understand why that follows.
Theorem: Any set $A$ can be well ordered.
Proof: By Lemmas 1.0 and 1.1, we can apply Zorn's Lemma to $\mathscr{A}$; thus $\mathscr{A}$ has a maximal element $(B,G)$. We will show that $B = A$; hence $A$ can be well-ordered. Otherwise, there exists $x \in A - B$; by defining $x$ to be greater than each element of $B$, we get an extension $G^*$ of $G$ that well-orders $B \cup \{x\}$. (More explicitly, $G^* = G \cup \{(a, x) : a \in B\}.$) This is a contradiction, since $(B, G)$ was assumed to be maximal.
Commentary: The part about applying Zorn's Lemma is easy to understand: Lemmas 1.0 and 1.1 show that $(\mathscr{A},\prec)$ is a nonempty ordered set in which every chain of $\mathscr{A}$ has an upper bound (this makes it an "inductive" set), and Zorn's Lemma says that every inductive set has a maximal element, which in this case is $(B, G)$. I think I'm correct so far. But I don't understand the remaining part, where they define $x$: why is it defined that way, and why does it contradict maximality? Could someone explain that part to me?
Here's the idea in a nutshell.
We consider subsets of $A$ equipped with well-orders, and we order these by extension "on top of each other". We then claim that this partial order satisfies the conditions of Zorn's lemma. A maximal element must then have the entirety of $A$ as the domain of its order, since otherwise we can pick an element not yet ordered, put it "on top" of the order, and obtain a strictly larger well-ordered subset of $A$. This is the usual maximality trick, the same as in the proof that every vector space has a basis, for example.
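To spell out the "put it on top" step in the notation of the question (a sketch; the name $B^*$ is introduced here and is not in the quoted proof): suppose $(B,G)$ is maximal and $x\in A-B$, and set

$$B^* = B\cup\{x\},\qquad G^* = G\cup\{(a,x) : a\in B\}$$

(together with $(x,x)$ if order relations are taken to be reflexive). Then the three conditions hold:

$$B\subseteq B^*,\qquad G\subseteq G^*,\qquad a\in B \text{ and } y\in B^*-B=\{x\}\ \Rightarrow\ (a,y)=(a,x)\in G^*.$$

So $(B,G)\prec(B^*,G^*)$, and $(B,G)\neq(B^*,G^*)$ because $x\notin B$. Also $G^*$ well-orders $B^*$: a nonempty $S\subseteq B^*$ either meets $B$, in which case the $G$-least element of $S\cap B$ is least in $S$, or it equals $\{x\}$. That gives an element of $\mathscr{A}$ strictly above the maximal one, which is the contradiction.

The step "$(B_j, G_j) \nprec(B_i,G_i)$, hence $(B_i ,G_i)\prec (B_j, G_j)$" in Lemma 1.1 is similar bookkeeping. Since $y\in B_j$ and $y\notin B_i$, we have $B_j\nsubseteq B_i$, so the first condition fails and $(B_j, G_j) \nprec(B_i,G_i)$. But $\mathscr{C}$ is a chain, so any two of its elements are comparable, and the only remaining possibility is $(B_i ,G_i)\prec (B_j, G_j)$.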
I think that a better way is to look at this proof through the lens of transfinite recursion. We fix a choice function on the nonempty subsets of $A$, and then we use it to generate a well-ordering, one step at a time, by choosing an element from the part we haven't ordered yet. This way we build the order literally one element at a time, until we exhaust the entirety of $A$, at which point we have what we wanted: a well-ordering of the whole of $A$.
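For a finite set the recursion is just a loop, which may help make the idea concrete. The sketch below is only a finite analogue: the function name `well_order` and the `choose` parameter are made up for illustration, `min` is a placeholder choice function that happens to work for finite sets of comparable items, and no finite loop captures the transfinite case, where the recursion runs along the ordinals and the axiom of choice is genuinely needed.

```python
def well_order(A, choose=min):
    """Order the finite set A by repeatedly choosing an element
    from the part that has not been ordered yet.

    `choose` plays the role of the choice function: given a nonempty
    set, it must return one of its elements.
    """
    remaining = set(A)
    ordered = []
    while remaining:
        x = choose(remaining)   # pick an element not yet ordered
        ordered.append(x)       # place it on top of everything chosen so far
        remaining.remove(x)
    return ordered


print(well_order({3, 1, 2}))                # choice function: min
print(well_order({"b", "a"}, choose=max))   # a different choice function
```

Different choice functions give different well-orderings of the same set, which matches the fact that the theorem asserts existence only and singles out no particular order.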