I'm now working my way through some basic set theory, and I came across the concept of an ordinal. In the book (and many other books), a transitive set is first defined as a set $x$ that satisfies
$\forall y \quad y\in x \implies y\subseteq x$.
And then an ordinal is defined as a transitive set whose elements are also transitive. Looking things up, I found out that this definition is equivalent to saying that an ordinal is a transitive set that is well-ordered by $\in$. From what I understand, ordinals are introduced to implement well-orders in set theory. I see why we would want to do this, and I see how the 'well-ordered by $\in$' part is natural.
But what is the 'transitive' part saying intuitively, and why is it interesting? Why do we need this assumption in the definition of an ordinal? And are there more interesting examples of sets that are transitive but not ordinals?
Sorry for the load of questions, this was a pretty strange definition for me.
Transitivity is probably best understood intuitively in terms of the concept of the transitive closure of a set. For any set $x,$ the transitive closure of $x,$ denoted $\operatorname{trcl}(x),$ is the smallest set that contains

- every element of $x,$
- every element of an element of $x,$
- every element of an element of an element of $x,$
- and so on...
Transitive sets are sets such that $x = \operatorname{trcl}(x).$
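For hereditarily finite sets, this can be checked by hand or by machine. Here is a minimal sketch in Python, assuming sets are modelled as nested `frozenset`s; the helper names `trcl` and `is_transitive` are just illustrative choices, not standard library functions.

```python
def trcl(x: frozenset) -> frozenset:
    """Collect the elements of x, the elements of those elements, and so on."""
    seen = set()
    stack = list(x)
    while stack:
        y = stack.pop()
        if y not in seen:
            seen.add(y)
            stack.extend(y)  # the elements of y are also "inside" x
    return frozenset(seen)


def is_transitive(x: frozenset) -> bool:
    """x is transitive if every element of x is also a subset of x."""
    return all(y <= x for y in x)


empty = frozenset()              # the empty set
one = frozenset({empty})         # {∅}
two = frozenset({empty, one})    # {∅, {∅}}
a = frozenset({one})             # {{∅}}: contains {∅} but not ∅

print(is_transitive(a))      # False
print(trcl(a) == two)        # True: the closure adds the missing ∅
print(trcl(two) == two)      # True: a transitive set equals its own closure
```

The set $\{\{\emptyset\}\}$ fails to be transitive precisely because $\emptyset$ is "inside" it without being an element of it.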
It's nice to consider the universe of sets as a directed graph, where the sets are nodes and there is an edge from $x$ to $y$ if $y\in x.$ Then the transitive closure of a set $x$ is the set of all sets $y$ that are accessible from $x$ in the graph by a path of at least one edge.
In another sense, we can imagine the transitive closure of $x$ as being all the sets that are "inside" $x,$ in a generalized sense, and then $x$ being transitive means all the sets "inside" are actually elements. In this sense, a transitive set $x$ "knows about" everything going on "inside" it. This may sound vague, but it's an important intuition for understanding absoluteness properties for transitive models.
So what about the ordinals? The big idea behind the ordinals is that every ordinal is the set of all lesser ordinals. So for instance the zeroth ordinal is $\emptyset,$ the first is $\{\emptyset\},$ and the second is $\{\emptyset, \{\emptyset\}\}.$ Under this idea, $\alpha\in\beta$ means the same thing as $\alpha<\beta,$ so the ordinals are linearly ordered by the membership relation $\in$, and thus $\in$ is transitive on the class of ordinals, i.e. for any ordinals $\alpha,$ $\beta,$ $\gamma,$ if $\alpha\in\beta$ and $\beta\in \gamma,$ then $\alpha\in \gamma.$ This, combined with the statement that every element of $\gamma$ is an ordinal, means exactly that $\gamma$ is transitive. Since $\gamma$ was arbitrary, we can conclude that every ordinal will be a transitive set, assuming the idea of "an ordinal is the set of lesser ordinals" is going to work (which it does).
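To see "an ordinal is the set of lesser ordinals" concretely for the finite ordinals, here is a small Python sketch. It assumes sets are modelled as nested `frozenset`s and uses the usual successor $\alpha\cup\{\alpha\}$; the names `successor`, `ordinal`, `is_transitive` and `is_ordinal` are hypothetical helpers introduced for illustration. The check in `is_ordinal` uses the "transitive set of transitive sets" definition, which is adequate here because nested `frozenset`s are automatically well-founded.

```python
def successor(a: frozenset) -> frozenset:
    """The successor ordinal a ∪ {a}."""
    return a | frozenset({a})


def ordinal(n: int) -> frozenset:
    """The n-th finite von Neumann ordinal, built by n successor steps from ∅."""
    a = frozenset()
    for _ in range(n):
        a = successor(a)
    return a


def is_transitive(x: frozenset) -> bool:
    """Every element of x is also a subset of x."""
    return all(y <= x for y in x)


def is_ordinal(x: frozenset) -> bool:
    """Transitive set of transitive sets (assumes well-foundedness)."""
    return is_transitive(x) and all(is_transitive(y) for y in x)


three = ordinal(3)
print(three == frozenset({ordinal(0), ordinal(1), ordinal(2)}))  # True
print(ordinal(1) in ordinal(2) and ordinal(2) in three)          # True
print(ordinal(1) in three)                                       # True
print(is_ordinal(three))                                         # True
```

The third line is the transitivity of $\in$ in action: from $1\in 2$ and $2\in 3$ we get $1\in 3.$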
As a side note, the definition "an ordinal is a transitive set, well-ordered by $\in$" is the better definition in some sense, since the definition "an ordinal is a transitive set of transitive sets" requires the axiom of foundation to work correctly. On the other hand, the second definition is less complex in a technical sense, so provided we're assuming foundation, it's useful.
There are other examples of transitive sets. Probably the most important are the sets in the von Neumann hierarchy $V_\alpha,$ which are defined recursively on the ordinals as $V_0=\emptyset,$ $V_{\alpha+1}=P(V_\alpha)$ and $V_\lambda = \bigcup_{\alpha<\lambda}V_\alpha$ for limit ordinals $\lambda.$ It's easy to see (or show by induction) that every $V_\alpha$ is transitive. There are several other natural hierarchies with transitive stages, like the constructible hierarchy $L_\alpha$ or the hierarchy $H_\kappa$ of sets of hereditary cardinality $<\kappa.$ If you gleaned from the explanation above that transitivity is a nice set-theoretical closure property, then it should make sense that transitivity will come up when we are "organizing" sets.
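The first few stages $V_n$ are small enough to compute directly, which gives a concrete transitive set that is not an ordinal. The following Python sketch again models sets as nested `frozenset`s; `powerset`, `V`, `is_transitive` and `is_ordinal` are illustrative helper names, and only finite stages are handled.

```python
from itertools import combinations


def powerset(x: frozenset) -> frozenset:
    """The set of all subsets of x."""
    items = list(x)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )


def V(n: int) -> frozenset:
    """The finite stage V_n: start from ∅ and apply the power set n times."""
    stage = frozenset()
    for _ in range(n):
        stage = powerset(stage)
    return stage


def is_transitive(x: frozenset) -> bool:
    """Every element of x is also a subset of x."""
    return all(y <= x for y in x)


def is_ordinal(x: frozenset) -> bool:
    """Transitive set of transitive sets (assumes well-foundedness)."""
    return is_transitive(x) and all(is_transitive(y) for y in x)


print([len(V(n)) for n in range(5)])             # [0, 1, 2, 4, 16]
print(all(is_transitive(V(n)) for n in range(5)))  # True
print(is_ordinal(V(2)), is_ordinal(V(3)))        # True False
```

$V_3$ is transitive but contains $\{\{\emptyset\}\},$ which is not itself transitive, so $V_3$ is not an ordinal. Equivalently, $\in$ does not linearly order $V_3$: neither of $\{\{\emptyset\}\}$ and $\{\emptyset\}$ ... wait, $\{\emptyset\}\in\{\{\emptyset\}\}$ does hold, but $\{\{\emptyset\}\}$ and $\{\emptyset,\{\emptyset\}\}$ are incomparable under $\in.$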