The characteristic of a ring (with unity, say) is the smallest positive integer $n$ such that $$\underbrace{1 + 1 + \cdots + 1}_{n \text{ times}} = 0,$$ provided such an $n$ exists. Otherwise, we define it to be $0$.
But why characteristic zero? Why do we not define it to be $\infty$ instead? Under this alternative definition, the characteristic of a ring is simply the “order” of the additive cyclic group generated by the unit element $1$.
My feeling is that there is a precise and convincing explanation for the common convention, but none comes to mind. I couldn't find the answer in the Wikipedia article either.
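There are two orderings of the set $\mathbb N = \{0,1,\dots\}$: the magnitude order, where $a \leq b$ has its usual meaning, and the divisibility order, where $a$ comes before $b$ when $a \mid b$.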
They are mostly compatible: whenever $a \mid b$ and $b \neq 0$, we have $a \leq b$. The only disagreement is at $0$.
Some definitions are phrased using the "greater than" ordering, while the "divisibility" ordering is the real essence.
For example, the greatest common divisor of $a$ and $b$ might be defined as the greatest number which is a common divisor of both $a$ and $b$. The characteristic of a ring $R$ might be defined as the smallest number $n>0$ which satisfies $n \cdot 1 = 0$.
Under such commonly taught definitions, it seems natural that $\operatorname{gcd}(0,0)=\infty$ and $\operatorname{char} \mathbb Z = \infty$.
However, those definitions implicitly rely on ideals, and are better phrased using the divisibility order. The incompatibility is then more visible: $0$ is the largest element in the divisibility order, while it is the smallest in the magnitude order. The magnitude order has no largest element, and $\infty$ is often added to cover this case.
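As a rough illustration of where the two orders disagree, here is a minimal Python sketch. The helper `divides` is a name made up for this example; `math.gcd` is the standard library function, which happens to follow the divisibility convention.

```python
import math

def divides(a: int, b: int) -> bool:
    """Return True when a | b, i.e. b = a * k for some integer k."""
    if a == 0:
        return b == 0  # 0 * k is always 0, so 0 divides only 0
    return b % a == 0

# Every natural number divides 0, so 0 is the top of the divisibility order.
assert all(divides(n, 0) for n in range(100))

# 1 divides everything, so 1 is the bottom of the divisibility order.
assert all(divides(1, n) for n in range(100))

# Away from 0 the two orders agree: a | b and b != 0 imply a <= b.
assert all(a <= b
           for a in range(1, 50)
           for b in range(1, 50)
           if divides(a, b))

# The standard library uses the divisibility convention: gcd(0, 0) is 0.
assert math.gcd(0, 0) == 0
assert math.gcd(0, 7) == 7
```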
So let's formulate the definitions again, but this time using the divisibility ordering.
Characteristic is a "multiplicative" notion, like gcd. If there is a (unital) homomorphism of rings $f\colon A \to B$, then $\operatorname{char} B \mid \operatorname{char} A$, because $n \cdot 1_A = 0$ implies $n \cdot 1_B = f(n \cdot 1_A) = 0$. For example, you cannot map ${\mathbb Z}_2$ to ${\mathbb Z}_4$, since $4 \nmid 2$; in a sense, ${\mathbb Z}_2$ is "smaller" than ${\mathbb Z}_4$. "Bigger" rings have "more divisible" characteristics: their characteristics are greater in the sense of divisibility. And the "most divisible" number is $0$. Another example is $\operatorname{char}(A \times B) = \operatorname{lcm}(\operatorname{char} A, \operatorname{char} B)$, which also holds when one of the characteristics is $0$, because $\operatorname{lcm}(n, 0) = 0$.
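The product formula can be checked by brute force on small examples. The sketch below is only an illustration: `char_zmod_product` is a hypothetical helper, and it encodes the factor $\mathbb Z$ as "modulus $0$", which is an assumption of this example. `math.lcm` needs Python 3.9 or later.

```python
import math
from itertools import count

def char_zmod_product(*moduli: int) -> int:
    """Characteristic of Z/m1 x ... x Z/mk, with modulus 0 standing for Z."""
    if any(m == 0 for m in moduli):
        # No positive multiple of 1 vanishes in a factor equal to Z.
        return 0
    # Otherwise search for the smallest n > 0 with n * (1, ..., 1) = 0,
    # i.e. n is a multiple of every modulus.
    for n in count(1):
        if all(n % m == 0 for m in moduli):
            return n

# char(A x B) = lcm(char A, char B)
assert char_zmod_product(4, 6) == math.lcm(4, 6)

# With the characteristic-0 convention the formula still holds for Z_2 x Z.
assert char_zmod_product(2, 0) == math.lcm(2, 0)
```

With $\infty$ in place of $0$, the second check would need a special case, since $\operatorname{lcm}(2, \infty)$ is not something the usual lcm can compute.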
In a bit more abstract language: given any ideal $I \subseteq \mathbb Z$, we associate to it its smallest nonnegative element under the divisibility order. Because every ideal of $\mathbb Z$ is principal, every other element of $I$ is a multiple of it. Let's call this number $\operatorname{min}(I)$.
We can now define $\operatorname{gcd}(a,b)=\operatorname{min} ((a) + (b))$, and $\operatorname{char} R = \min (\ker f)$, where $f \colon \mathbb Z \to R$ is the canonical map.
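To see $\operatorname{min}((a) + (b))$ in action, here is a small Python sketch that looks at a finite sample of the ideal $(a) + (b)$ and picks the element that divides all the others. The names `divides` and `min_of_ideal_sum` and the parameter `bound` are invented for this example, and the sketch assumes `bound` is large enough for the generator to show up in the sample (true for the small inputs below).

```python
import math

def divides(a: int, b: int) -> bool:
    """Return True when a | b, with the convention that 0 divides only 0."""
    if a == 0:
        return b == 0
    return b % a == 0

def min_of_ideal_sum(a: int, b: int, bound: int = 10) -> int:
    """Divisibility-least nonnegative element of a finite sample of (a) + (b)."""
    coeffs = range(-bound, bound + 1)
    # Nonnegative elements x*a + y*b with small coefficients x, y.
    sample = {abs(x * a + y * b) for x in coeffs for y in coeffs}
    # The minimum for divisibility is the element dividing every other one.
    candidates = [g for g in sample if all(divides(g, s) for s in sample)]
    return candidates[0]

assert min_of_ideal_sum(12, 18) == math.gcd(12, 18)
assert min_of_ideal_sum(0, 5) == math.gcd(0, 5)
# The ideal (0) + (0) contains only 0, so its minimum is 0, not infinity.
assert min_of_ideal_sum(0, 0) == math.gcd(0, 0)
```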
The definition of $\operatorname{min}(I)$ works for any PID; it does not require a magnitude order. (In a general PID, $\operatorname{min}(I)$ is determined only up to a unit.) In any PID, $I = (\operatorname{min}(I))$.
(I dislike saying the ideal $\{0\}$ is "generated" by $0$; although this is true, it is also generated by the empty set. We do not say that $(2)$ is generated by $0$ and $2$.)