Why is the positional number system natural?


In the theory of computation, one mainly deals with maps $\Sigma^*\rightarrow\Sigma^*$. To discuss computation on a set $X$ other than $\Sigma^*$, one fixes a representation $\gamma:\Sigma^*\rightarrow X$ to interpret a word as an element of $X$. If $X=\mathbb N$, $\gamma$ can be the map defined by the unary numeral system, by a positional number system with base 2 or more, by arithmetic circuits, or even by descriptions of algorithms that compute a natural number. Of course the standard choice is a positional number system with base 2 or more (call this $\gamma_2$). Many interesting questions in the theory of computation, such as $\mathrm{PRIMES}\in \mathscr P$? and $\mathrm{KNAPSACK}\in\mathscr P$?, arise because of this particular representation.

However, I cannot find a reason for favoring $\gamma_2$ other than that it is the representation most used by human and electronic computers. One can argue that unary numerals are the simplest representation. If you want a concise representation, you could use arithmetic circuits or algorithms. I am beginning to suspect that choosing $\gamma_2$ as the standard is an arbitrary decision.

Thus my question is: why is $\gamma_2$ the most favorable, or natural, way of representing natural numbers (if it is)?

Of course, in an informal way I understand why $\gamma_2$ is favorable and natural. (You cannot claim that you found the $10^{500}$-th prime number by showing an algorithm that computes it.) My question is how to formalize this intuitive notion of naturalness and how to show $\gamma_2$ is the most natural (or at least as natural as other representations) according to it.

I believe someone must have considered this problem before. I am also looking for resources that deal with this matter.

Thank you in advance.

There are 3 answers below.

Best Answer

This is something that's recently made me curious, so forgive me for waxing philosophical: I also wonder if the choice of representation is somehow arbitrary, or whether maybe positional notation satisfies a kind of universal property with respect to some category of "tractable representations," or a similar concept.

Tractable Time Complexity of Combinatorial Operations

To me the ubiquity of positional notation seems to relate to the computational complexity of certain "natural" operations under that representation scheme. As Timothy's answer indicates, these operations have to do with counting: succession, addition, multiplication, exponentiation, and so on (hyper-operations). In positional notation, the first few of these operations are easily computable in time polynomial in the input size. Positional representations of natural numbers are also essentially power series, although this may raise the further question: why are power series so ubiquitous? It may have the same answer.
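For instance, addition of positional representations runs in time linear in the number of digits. A minimal Python sketch for the binary case (the function name `add_binary` is my own choice, not from the original post):

```python
def add_binary(x, y):
    """Grade-school addition of two binary strings, right to left.

    Runs in O(max(len(x), len(y))) time -- linear in the input size.
    """
    i, j, carry, out = len(x) - 1, len(y) - 1, 0, []
    while i >= 0 or j >= 0 or carry:
        s = carry
        if i >= 0:
            s += int(x[i])
            i -= 1
        if j >= 0:
            s += int(y[j])
            j -= 1
        out.append(str(s & 1))   # low bit of the column sum
        carry = s >> 1           # high bit carries to the next column
    return "".join(reversed(out))

# 101 + 11 = 5 + 3 = 8 = 1000 in binary
print(add_binary("101", "11"))  # -> 1000
```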

I think the question of how to translate between different selected representations of natural numbers (or mathematical objects in general) gets to the core of what mathematics seems to be all about. For example, the problem of integer factorization can be viewed simply as a translation from one chosen representation of natural numbers into another, and the question of whether this translation can be computed in a tractable way under given constraints is of great interest, both practically and philosophically. Equations themselves are simply indications that one representation and another are references to the same object, and the use of an equation seems to be to translate fluently between one representation and another. It seems that different representations of the same object can make different properties of that object clearer or more computationally accessible, depending on the context--I don't think this is a contentious observation.

Maybe it will be instructive to compare positional notation with an alternative, "less natural" scheme, to see which properties we might want our "natural" scheme to have. The following recursive representation scheme uniquely encodes every positive natural number: let $1$ denote one, let $(X,Y)$ denote the $X$th smallest prime number raised to the power of $Y$, and let concatenation indicate multiplication. The first few natural numbers are as follows:

$$1$$ $$(1,1)$$ $$((1,1),1)$$ $$(1,(1,1))$$ $$(((1,1),1),1)$$ $$(1,1)((1,1),1)$$ $$((1,(1,1)),1)$$
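To make the scheme concrete, here is a minimal Python sketch of an encoder (the helper names `nth_prime`, `prime_index`, `factorize`, and `encode` are my own choices, and the trial-division primality test is only suitable for small inputs):

```python
def nth_prime(k):
    """Return the k-th smallest prime (1-indexed: nth_prime(1) == 2)."""
    count, n = 0, 1
    while count < k:
        n += 1
        if all(n % d for d in range(2, int(n**0.5) + 1)):
            count += 1
    return n

def prime_index(p):
    """Inverse of nth_prime: the position of the prime p in the prime sequence."""
    k = 1
    while nth_prime(k) != p:
        k += 1
    return k

def factorize(m):
    """Prime factorization of m as a dict {prime: exponent}, by trial division."""
    factors, d = {}, 2
    while m > 1:
        while m % d == 0:
            factors[d] = factors.get(d, 0) + 1
            m //= d
        d += 1
    return factors

def encode(n):
    """Encode a positive integer in the nested scheme:
    1 -> "1"; (X,Y) -> the Xth prime raised to Y; concatenation -> product."""
    if n == 1:
        return "1"
    return "".join(
        f"({encode(prime_index(p))},{encode(e)})"
        for p, e in sorted(factorize(n).items())
    )

print(encode(6))   # -> (1,1)((1,1),1)
print(encode(7))   # -> ((1,(1,1)),1)
```

This reproduces the encodings listed above, and `len(encode(5**15)) == 34` confirms the thirty-four-symbol count claimed below for five to the power fifteen.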

This "nested" representation scheme makes it easy to identify when two represented numbers are equivalent, and whether a represented number is prime or composite, or a perfect power, but most of the combinatorial operations are not easy at all. It is still possible to write down some arbitrary number in this notation, or to construct a random number using a certain number of parenthetical pairs, just as it is possible to do so in binary using a certain number of symbols. For example, $$(((1,1),1),((1,1),1)(((1,1),1),1))$$ represents five raised to the power of fifteen, which in binary is $$11100011010111111010100100110001101$$ The binary representation requires thirty-five symbols, while the alternative scheme requires thirty-four. The asymptotic order of the length of the nested representation of the natural number $n$ isn't clear to me; my guess is that it is at most $O(\log{n})$--maybe it's even $o(\log{n})$--and either would be interesting.

I found your statement that you cannot claim to find the $10^{500}$th prime by simply showing an algorithm that computes it interesting, and somewhat antithetical to your consideration. Why is a binary string, which itself is in a sense a program to identify a natural number, an acceptable algorithm to use to present this hypothetical prime, but something such as, for example, $$((1,(1,(1,1))((((1,1),1),1),((1,1),1))((1,1),1)))((((1,1),1),1),(1,(1,1))((((1,1),1),1),((1,1),1))((1,1),1))),1)$$ which is a unique encoding of the $10^{500}$th prime in a fixed universal scheme of representation, is unacceptable, and perhaps even less acceptable than simply stating "the $10^{500}$th prime"? I think the answer is that it doesn't tell us the relative place of this number---in this scheme, there doesn't seem to be a tractable way to compare the relative sizes of two numbers, i.e. to determine which one is lesser and which one is greater, which is probably what we actually care about when considering the identity of a natural number. Positional representation schemes, on the other hand, accomplish this task almost immediately.

Maybe more to the point, I think that given representations of $X$ and $Y$, we probably want to have tractable procedures to do the following things:

  1. Decide whether $X$ is greater than $Y$,
  2. Decide whether there is some $Z$ intermediate to $X$ and $Y$,
  3. Produce the representation of some $Z$ intermediate to $X$ and $Y$, if one exists.

If we restrict attention to the toy representation scheme above, it seems that we can perform these three tasks easily only in special cases--in general, they are computationally expensive. In contrast, they are extremely easy to compute in a positional scheme.
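For the positional (binary) case, the three tasks above can be sketched in a few lines of Python. This assumes binary strings without leading zeros, and uses `int`/`bin` conversions for brevity, though digit-level routines achieve the same results in linear time:

```python
def greater(x, y):
    """Task 1: is the number represented by x greater than that of y?
    Assumes binary strings with no leading zeros."""
    if len(x) != len(y):
        return len(x) > len(y)   # longer string = larger number
    return x > y                 # equal lengths: lexicographic order agrees

def has_intermediate(x, y):
    """Task 2: is there some Z strictly between x and y?"""
    lo, hi = sorted((int(x, 2), int(y, 2)))
    return hi - lo > 1

def intermediate(x, y):
    """Task 3: produce some Z strictly between x and y, or None."""
    lo, hi = sorted((int(x, 2), int(y, 2)))
    if hi - lo <= 1:
        return None
    return bin((lo + hi) // 2)[2:]   # e.g. the midpoint

print(greater("101", "11"))      # -> True   (5 > 3)
print(intermediate("1", "100"))  # -> 10     (2 lies between 1 and 4)
```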

Highly Tractable Space Complexity for Multiplication

Regarding the comparison between unary and positional notations with larger radixes, I think that the main consideration has to do with space complexity, especially for multiplication (but also for addition). If natural numbers $n$ and $k$ are represented in unary as strings $U(n)$ and $U(k)$, then the space complexity of multiplication of $n$ and $k$ is at best $O(|U(n)|\cdot |U(k)|)$, simply because we must actually represent the product. If instead we have natural numbers $n'$ and $k'$ represented in binary as strings $B(n')$ and $B(k')$, then the space complexity of multiplication of those two numbers is $O(|B(n')|+|B(k')|)$. This is a tremendous improvement in space complexity even comparing representations of the same lengths, and we gain the additional benefit that larger numbers can be represented more compactly in the first place. At the same time I believe we also find a more modest improvement in time complexity--the best known time complexity to multiply two $n$-digit numbers given in positional notation is $O(n\log(n))$ (published by D. Harvey and J. van der Hoeven in 2019), whereas the time complexity of unary multiplication is bounded below by its space complexity. So all around unary may reasonably be seen as a natural or primitive representation scheme, but it is computationally inefficient in both space and time compared with larger radixes.
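The space gap between unary and binary is easy to check concretely; a quick Python illustration (the helper names `unary` and `binary` are mine):

```python
def unary(n):
    """Unary representation U(n): n copies of the symbol '1'."""
    return "1" * n

def binary(n):
    """Standard positional base-2 representation B(n)."""
    return bin(n)[2:]

n, k = 1000, 2000
# Unary: merely writing down the product already costs |U(n)| * |U(k)| symbols.
assert len(unary(n * k)) == len(unary(n)) * len(unary(k))
# Binary: the product fits within |B(n)| + |B(k)| symbols.
assert len(binary(n * k)) <= len(binary(n)) + len(binary(k))

print(len(unary(n)) * len(unary(k)))           # -> 2000000
print(len(binary(n)) + len(binary(k)))         # -> 21
```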

Increasing the radix beyond two does not yield a change in the "big O" complexity of multiplication as a function of input length--indeed, it is impossible to perform any operation in a space of order less than the sum of the lengths of the input representations, since that space is required to store the inputs--but increasing the radix does compactify the inputs more before the computation begins. Log-space transduction also establishes positional multiplication as belonging to $L$ (log-space). In this sense positional notation accomplishes a nearly optimal order of space complexity for multiplication as a function of the length of the input representations, and the choice of where to stop increasing the radix, given that a positional scheme is adopted, seems (to me) to be at least cultural and historical if not also having some small bit to do with more fundamental aspects of human cognition.

Additional Notes

The decision problems (1) and (2) above together constitute the determination of whether $X$ is the successor of $Y$. Suppose we have a tractable (e.g., polynomial-time) algorithm to decide whether two representations are equivalent, and also to compute the representation of the successor of $X$ given the representation of $X$. Then we certainly have an algorithm to decide problems (1) and (2): first check whether $X$ and $Y$ are equivalent, and if not, iteratively construct the successors of $X$ and $Y$ until one of the successors is equivalent to $X$ or $Y$. This also enables solving (3). (Note, though, that this iteration is tractable in the values of the numbers, not necessarily in the lengths of their representations.)

Furthermore, a tractable successor algorithm along with a tractable equivalence decider enables the construction of a tractable addition algorithm, and then multiplication, and so on. A question that now comes to mind is whether the reverse is true, i.e. whether (1), (2), and (3) all together imply both a tractable equivalence decider and a tractable succession constructor. Actually, the equivalence decider is guaranteed, since if neither of $X$ or $Y$ is greater than the other, and if there is no $Z$ intermediate to $X$ and $Y$, then $X$ and $Y$ must be equivalent.
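The derivation of an equivalence decider from tasks (1) and (2) is just a couple of boolean checks; a minimal sketch, with plain integers standing in for representations (the stand-ins `g` and `h` are hypothetical, purely to exercise the logic):

```python
def equivalent(X, Y, greater, has_intermediate):
    """Equivalence from tasks (1) and (2): X and Y are equivalent when
    neither is greater than the other and nothing lies between them."""
    return (not greater(X, Y)
            and not greater(Y, X)
            and not has_intermediate(X, Y))

# Toy stand-ins: representations are plain integers.
g = lambda a, b: a > b            # task (1)
h = lambda a, b: abs(a - b) > 1   # task (2)

print(equivalent(3, 3, g, h))  # -> True
print(equivalent(3, 4, g, h))  # -> False
```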

But now, it becomes clear that with (1), (2), and (3), we can almost produce a tractable succession constructor. To find the successor of $X$ with these procedures, though, it appears that we may need to be able to construct some $Y$ that is larger (and not too much larger) than $X$. Then we can proceed as follows:

  1. If $Y$ is larger than $X$ but no intermediary exists, then $Y$ is the successor of $X$;
  2. Otherwise, there is some intermediary $Z$, which we can construct using (3). Repeat from the previous step using this new $Z$ instead of $Y$ (and so on) until the successor of $X$ is found.
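The loop in the two steps above can be sketched as follows. The helper `make_larger` is my own assumption: some tractable way to produce a $Y$ greater than $X$ that is not too much larger; given that, each pass shrinks the gap, so with a midpoint-style `intermediate` the number of iterations is logarithmic in the initial gap.

```python
def successor(X, greater, intermediate, make_larger):
    """Find the successor of X using only tasks (1)-(3).

    intermediate(a, b) returns some Z strictly between a and b,
    or None if no such Z exists (tasks (2) and (3) combined).
    """
    Y = make_larger(X)            # assumed: some Y with greater(Y, X)
    assert greater(Y, X)
    while True:
        Z = intermediate(X, Y)
        if Z is None:             # step 1: no intermediary, so Y = X + 1
            return Y
        Y = Z                     # step 2: shrink the gap and repeat

# Toy instantiation with integers standing in for representations:
succ = successor(5,
                 lambda a, b: a > b,
                 lambda a, b: (a + b) // 2 if b - a > 1 else None,
                 lambda x: 2 * x + 2)
print(succ)  # -> 6
```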
Answer

This is a very interesting question. I know of the approach of Shapiro, who in the paper "Acceptable notation" asked which notations capture all computable functions. But your question goes further: among acceptable notations, why is binary at least as good as any other? Obviously, this requires defining what we mean by "good" in this context. My wild guess would be that it has something to do with the economy of notation. Still, this is vague. But basically, unary notation is not as economical as binary. Maybe one can define a measure of how concise a notation for the naturals is. Another thought is that this might have something to do with the economy of computation (under a given notation), since notations affect the space and/or time complexity of functions. This is not much, but I hope it helps a bit.

Answer

Why binary?

With $0$ characters in our alphabet, we can't form nonempty strings, so we can represent at most one object; with $1$ character, strings of length $\le n$ can represent only $O(n)$ objects; with at least $2$ characters, they can represent exponentially many objects; and further increasing the number of characters only reduces the $O(\log n)$ maximum string length needed to represent $n$ objects by an $O(1)$ factor. So we may as well use $2$ characters.
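A quick computation illustrates the constant-factor effect of enlarging the alphabet (the helper name `max_len_needed` is mine):

```python
import math

def max_len_needed(num_objects, k):
    """Roughly the string length needed to index num_objects distinct
    objects positionally over a k-character alphabet: ceil(log_k)."""
    return max(1, math.ceil(math.log(num_objects, k)))

N = 10**6
len_base2 = max_len_needed(N, 2)    # about log2(N)
len_base16 = max_len_needed(N, 16)  # shorter only by the constant log2(16) = 4

print(len_base2, len_base16)  # -> 20 5
```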

Why $\gamma_2$?

We can then represent the integers $0$ to $2^n-1$ with $n$-bit strings in one natural way, $\gamma_2$, or in one of its variants: either we globally exchange the symbols used, or we exchange them at some but not all positions. The former is isomorphic to $\gamma_2$; the latter is less natural.

A motive for both

Similarly, with $b\ge2$ characters we can represent the integers $0$ to $b^n-1$ with $n$-character strings in a natural way analogous to $\gamma_2$, or in isomorphic or unnatural variants. If $b=2$, a representation simply identifies the set of powers of $2$ being summed, whereas for $b>2$ coefficients are needed too. (You can also think of this in terms of multisets.)