Theorem. Let $X$ and $Y$ be set, then $|X|\le |Y|$ or $|Y|\le|X|.$
Proof. We consider the family $$\mathcal{F}=\left\{(A,f)\;\middle |\; A\subset X,\;f\colon A\to Y\;\text{injective}\right\}.$$ We define a order relation on $\mathcal{F}$ in the follow way $$(A,f)\le(B,g)\quad\text{if and only if}\quad A\subseteq B\;\text{and}\;g_{|A}=f.$$
Question 1. Why does $A$ have to be a proper subset of $X$?
Let $$C=\{(A_i,f_i),\; i\in I\}$$ a chain in $\mathcal{F}$. This means that if $i,j\in I$, then $A_i\subseteq A_j$ $\big(f_{j|A_i}=f_i\big)$ or $A_j\subseteq A_i$ $\big(f_{i|A_i}=f_j\big).$
We observe that if $a\in A_i$ and $b\in A_j$, then $a, b\in A_j$ if $A_i\subseteq A_j$ or $a,b\in A_i$ if $A_j\subseteq A_i.$ We consider $$A:=\bigcup_{i\in I} A_i,$$ and we define $f\colon A\to Y$ such that $f(a):=f_i(a)$ if $a\in A_i.$ We observe that the application $f$ is well defined. In fact if $a\in A_i\cap A_j$, then $a\in A_i$ and $a\in A_j$, therefore $f(a):=f_i(a)$ and $f(a):=f_j(a)$, however we can suppose that $A_i\subseteq A_j$, then $$f(a):=f_{j}(a)=f_{j|A_i}(a)=f_i(a).$$ Moreover $f$ is injective: let $a,b\in A$ with $f(a)=f(b)$, then for what they said before they belong to the same $A_i$ for same $i\in I$, thus $$f_i(a)=f(a)=f(b)=f_i(b),$$ then $$f_i(a)=f_i(b),$$ therefore $a=b$ because $f_i$ is injective. We have proved that $(A,f)\in\mathcal{F}.$. Now, for definition $A$ is an upper bound of $\mathcal{F}$, then by Zorn's Lemma $\mathcal{F}$ admtis maximal elements. Let $(S,f)\in\mathcal{F}$ such that $A\ne X$ and $f(S)\ne Y$
Question 2 Why do we need to take $S\ne X$ and $f(S)\ne Y$?
We take $x_0\in X\setminus S$ and $y_0\in Y\setminus f(S)$, we can exstends $f$ to $S\cup \{x_0\}$ defining $f(x_0):=y_0.$ Therefore, if $(S,f)$ is maximal in $\mathcal{F}$, then $S=X$ or $f(S)=Y$.
Question 3. Why if $(S,f)$ is maximal in $\mathcal{F}$, then $S=X$ or $f(S)=Y?$
Thanks for patience!
For the first question, this is a notational issue. Since you do not cite the source of the proof I can't give you an exact answer. However, the only way this makes sense is if $\subset$ does not indicate proper subsets. Since this is something that some authors do, I'm inclined to believe this is the case here as well.
For the second and third questions, we don't need to take $A$ and $f(A)$ that way. We want to show that if $A$ is not $X$ and $f(A)$ is not $Y$, then $(A,f)$ is not a maximal element in the partial order. In particular, a maximal element—if it exists—must have either $A=X$ or $f(A)=Y$.
But finally, by Zorn's lemma, such maximal element does exist. So we are done.