Function from natural to lexicographically ordered words of a-z

Question

Function from natural to lexicographically ordered words of a-z

83 Views Asked by Bumbble Comm At 28 Mar 2026 - 10:35

I think I am looking for a very basic lexicographic ordering concept/function/algorithm, but I'm stumped. Basically, it has to produce the following mapping:

f(0) = a
f(1) = b
...
f(25) = z
f(26) = aa
f(27) = ab
...
f(??) = zz
f(??) = aaa

I think what makes this difficult is that a != aa, in the sense that the symbol a is 0 when it is alone, but when going higher it is not anymore. For example, 26 is 10 in base 26. To contrast, the function I am looking for would return "aa" in that case. So it kind of assigns a to 1 if it is in the most significant position, and a to 0 if it's in a less significant position, if that makes sense... I guess I got to this reasoning by trying to implement f as a conversion into base26, but that doesn't work out, since it starts skipping a's then (because, why would you write 00001, in that case you leave out the zeroes. Also, so when you get to z, and increment it, it increments to ab, and skips aa, because z = za in base26).

So I guess, to rephrase, I am looking for a bijection with type $\mathbb{N}\rightarrow\{a, \cdots,z\}+$, i.e., from the naturals to all words formed of symbols a to z. Is there such a function, and maybe even an efficient/simple implementation? It would also be great to have the inverse of that function, if it exists.

Original Q&A

There are 2 best solutions below

Bumbble Comm On 03 Jul 2021 - 4:18

Here is a slightly different approach. We consider \begin{align*} &\mathrm{base }\ 10\quad\to\quad \{0,1,\ldots,9\}\\ &\mathrm{base }\ \ \;2\quad\to\quad \{0,1\}\\ &\mathrm{base }\ 26\quad\to\quad \{\mathrm{a},\mathrm{b},\ldots,\mathrm{z}\}\qquad \mathrm{???}\\ \end{align*} Regrettably taking base $26$ is not really promising cause we would have a mapping \begin{align*} &\mathrm{a}\ \quad \ \to 0\\ &\mathrm{b}\ \quad \ \to 1\\ &\quad\vdots\\ &\mathrm{z}\ \quad \ \to 25\\ &\mathrm{aa}\quad\to 0\cdot 26+0=0 \end{align*} which gives us leading zeros which are not wanted.

But we can overcome this drawback by introducing a dummy symbol $0$ and working with base $27$ instead of base $26$. This way we obtain \begin{align*} \color{blue}{\mathrm{base }\ 27\quad\to\quad \{0,\mathrm{a},\mathrm{b},\ldots,\mathrm{z}\}} \end{align*} We now have the mapping \begin{align*} &\mathrm{a}\ \ \quad \to 1_{10}\\ &\mathrm{b}\ \ \quad \to 2_{10}\\ &\quad\vdots\\ &\mathrm{z}\ \ \ \quad \to 26_{10}\\ &\mathrm{aa}\ \quad \to 1\cdot 27 + 1 = 28_{10}\\ &\mathrm{ab}\ \quad \to 1\cdot 27 +2 = 29_{10}\\ &\quad\vdots\\ &\mathrm{zz}\ \ \quad \to 26 \cdot 27 + 26 = 728_{10}\\ &\mathrm{aaa}\quad \to 1\cdot 27^2+1\cdot 27^1+1 = 757_{10} \end{align*}

Notes:

The mapping has some gaps, since we do not obtain base $27$ numbers which contain zeros. So, this is an injective mapping, not a bijective one. But, if this is not a problem ...
The algorithm looks rather simple. In base $10$ we have for instance \begin{align*} 387 = \left(3\cdot 10+8\right)10+7 \end{align*} and similarly we have in base $28$ \begin{align*} \mathrm{eac} \to \left(5\cdot 27 + 1\right)27 + 3 = 3\,675_{10} \end{align*}

**Bumbble Comm** · Accepted Answer

First we well find $f^{-1} (word) $ and then using that to find $f(num)$,note that $f^{-1}(aa)=26$ because $aa$ comes after all the one letter words and there are $26$ start counting from $0$ because $f(a)=0$ then we get that $f^{-1}(aa)=26$ and $ f^{-1} (aaa) = 702$ because it comes after all the one and two letter words and the two letter words are $26^2$ in total and start counting from $0$ we get that $f^{-1}(aaa) = 702$ and in general $f^{-1} (a^M = a\cdots a ) = \sum \limits_{i=1}^{M-1} 26^i = \frac{26^M-26}{25}$ and now we can generalize to any word simply be noticing that if we replace $m$ letters of $a$ with any other letter in any position we add $26^{pos} $ times $ value (letter)$ for instance $ f^{-1}(bc) = 26+ 1*26^1+2*26^0 = 54$ so the general formula is $ f^{-1}(word) = \frac{26^M-26}{25}+ \sum \limits_{i=0}^{M} word[i]*26^i$ so for example $f^{-1}(a b c b a)= \frac{26^5-26}{25}+ 0*26^0+1*26^1+2*26^2+1*26^3+0*26^4 = 494,208$

Now for $f$ itself, if we only have this sum $\sum \limits_{i=0}^{M} word[i]*26^i $ it would be easy to figure out the word from it assuming we know $M$, so for any value $num$ we find the $M$ such that $ \frac{26^M-26}{25} \leq num < \frac{26^{M+1}-26}{25}$ (using $\log$ is one way $M= \lfloor \frac{\log(25num+26)}{\log{26}} \rfloor$) then we get the value $ num-\frac{26^M-26}{25} = \sum \limits_{i=0}^{M} word[i]*26^i$ and then using $\mod 26$ method to find the word, for example $f(10000)$ => $M = \lfloor 3.8149\rfloor = 3$ so we are looking for three letters word, $10000-\frac{26^3-26}{25}= 9298 $ , Now $ 9298 = 357*26+16$ and $ 357 = 13*26+19$ and $ 13 = 0*26+13$ and so the word in values is $\{13,19,16 \} = \{ n,t, q\}$ and so $f(10000)= ntq$.

In computer algorithm all the above is easily implemented (except maybe how to find $M$ quickly since the approach i took will not work in a pc because of the floating point problem, i hope someone will give better way to find $M$).

Function from natural to lexicographically ordered words of a-z

There are 2 best solutions below

Related Questions in ELEMENTARY-NUMBER-THEORY

Related Questions in INVERSE-FUNCTION

Related Questions in REGULAR-LANGUAGE

Related Questions in FINITELY-GENERATED

Related Questions in REGULAR-EXPRESSIONS

Trending Questions

Popular # Hahtags

Popular Questions