Equation to Specify Lexicographical Ordering?


I am not a mathematician, but I need to specify, precisely, a special way to sort an arbitrary list of ASCII strings¹, with the addition of some special rules for a small set of specific characters. I believe this is called "lexicographical ordering" or "collation" rules.

For an extremely simple example, let's say I want $\text{A} < \text{B} < \text{C} < \text{F} < \text{E} < \text{D}$. For that, I am imagining something like the following:

$ \newcommand{\coloneqq}{\mathrel{\mathop:}\mathrel{\mkern-1.2mu}=} C \coloneqq \{ \text{A}, \text{B}, \text{C}, \text{D}, \text{E}, \text{F}, \text{G} \} \\ f(c) \colon C \to \mathbb{N} \\ O \coloneqq \{ (\text{A}, 1), (\text{B}, 2), (\text{C}, 3), (\text{F}, 4), (\text{E}, 5), (\text{D}, 6) \} \\ f_c \coloneqq \{ i \in \mathbb{N} \mid \exists ((c, i) \in O)\} $

I think this is incorrect because I end up with the function returning a set instead of a number I can use for comparison (e.g., to test whether $f(\text{F}) < f(\text{C})$).
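To make the behaviour I am after concrete, here is a minimal Python sketch of the same idea. It assumes each ranked character maps to exactly one number, as in the pairs of $O$. The names `ORDER`, `RANK`, and `sort_key` are hypothetical and only for illustration, and G is left unranked because it does not appear in $O$.

```python
# Illustrative sketch only: ORDER, RANK, f and sort_key are hypothetical names.
ORDER = "ABCFED"  # the custom order from the example: A < B < C < F < E < D

# Map each character to its 1-based position, matching the pairs in O.
RANK = {ch: i for i, ch in enumerate(ORDER, start=1)}


def f(c: str) -> int:
    """Return the rank of one character; raises KeyError for unranked characters."""
    return RANK[c]


def sort_key(s: str) -> list:
    """Turn a string into its list of ranks; Python compares lists element by element."""
    return [f(c) for c in s]


words = ["AD", "AF", "B", "AE", "A"]
print(f("F") < f("C"))              # False, since F ranks 4 and C ranks 3
print(sorted(words, key=sort_key))  # ['A', 'AF', 'AE', 'AD', 'B']
```

Here `f` returns a single number per character, and comparing the rank lists position by position is what I understand "lexicographical" to mean for whole strings.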

How can I do this properly? What are some keywords I could search for to read more about this?


¹ Eventually, I would like to override the Unicode® collation algorithm, but let's take that out of scope to keep this simple. I was hoping that algorithm might contain the answer I need, but it is very complex, and I see no sign of a general equation.