I think most proofs of Zorn's Lemma (specifically that it follows from the Axiom of Choice) fall into two categories, one using ordinals and the other, more complicated, avoiding ordinals (but introducing towers (or closed sets) and such).
I was wondering if there were any specific links between the two proofs, especially linking elements of the more complicated proof to equivalent steps in the simpler proof using ordinals, since maybe that could help intuition.
Those are not the only kinds of proofs. You can prove it from similar statements such as the Teichmüller–Tukey Lemma or the Hausdorff Maximality Principle, and those proofs are far more direct in nature.
You can also prove it from any other statement from the incredibly diverse collection of equivalents of Zorn's lemma (and the axiom of choice). Things such as "every vector space has a basis" or "every non-empty set has a group structure" etc.
But it is true that the classic equivalence of Zorn's lemma with the axiom of choice and the well-ordering theorem is what most people learn. And in a lot of cases you can prove these equivalences in any direction you want without paying too much in complexity.
The things about the axiom of choice is that on its own it is utterly useless. And people tend to forget that when they want a proof via the axiom of choice "and nothing else". The axiom of choice to proofs is like a knife to a steak. It is a tool (and an important one), it is not the main ingredient.
Using the axiom of choice generally gets coupled with transfinite recursion. And you can dress that using ordinals, or by creating awful constructions (à la Halmos) just to avoid "the dreaded transfinite ordinals". But ultimately, it is what it is.
The axiom of choice is useless without something to drive it forward. And you need to have a way to construct larger and larger chains, and use the axiom of choice to uniformly choose upper bounds and ensure the construction can be continued.
Now, with the well-ordering theorem, you just put the axiom of choice into the proof, because from a well-ordering you can choose the least possible candidate if it exists. So it is really just the same thing, only here we have a structure that provides us with a uniform choice at each step. In this case, by the way, we are saving another step in the proof: we do not need to go via Hartogs' theorem to argue the recursion stops.
In any case, the intuition behind all of these proofs is the same. You try to approximate a maximal element. How? By constructing larger and larger chains. You hope that at some point you'd exhaust all the possible ways to extend your chain, which will then provide you with a chain that either has no upper bound, or has a maximal element (in the partial order).