Why are binary representations of huge numbers about $3.3218$ times as long as their decimal representations?

5.4k Views Asked by At

Why are huge binary nubers about $3.3218$ times longer than their decimal counterpart?

I thought about this when I was writing this Python code:

huge_number = 21**31**3 # ** is the power operator
print((len(bin(huge_number)) - 2) / len(str(huge_number)))
# -2 is technical stuff ignore it

No matter what the $\texttt{huge_number}$ is (it has to be huge, this does NOT work for small numbers), you will get $3.3218$. Why?

4

There are 4 best solutions below

1
On BEST ANSWER

The number of digits of the representation of a positive integer $n$ in base $k$ is $$\ell_k(n) := \lfloor \log_k n \rfloor + 1,$$ and so the ratio of the length of a binary representation of a number to its decimal length is $$\frac{\ell_2(n)}{\ell_{10}(n)} = \frac{\lfloor \log_2 n \rfloor + 1}{\lfloor \log_{10} n \rfloor + 1}.$$

For large $n$, the constant terms in the numerator and denominator don't affect the ratio much, and neither do the differences between the values $\log_k n$ and their respective floors (which are always in $[0, 1)$), so (for large $n$) the ratio satisfies

$$\color{#df0000}{\boxed{\frac{\ell_2(n)}{\ell_{10}(n)} \approx \frac{\log_2 n}{\log_{10} n} = \log_2 10 = 3.32192\ldots}}.$$

A little more precisely, the definition of floor gives that $\log_k n \leq \lfloor \log_k n \rfloor + 1 \leq \log_k n + 1$, and so $$ \frac{\log_2 n}{\log_{10} n + 1} \leq \frac{\ell_2(n)}{\ell_{10}(n)} \leq \frac{\log_2 n + 1}{\log_{10} n} . $$ Using some straightforward algebra we can rewrite this as $$ \left(1 - \frac{1}{\log_{10} n + 1}\right) \log_2 10 \leq \frac{\ell_2(n)}{\ell_{10}(n)} \leq \left(1 + \frac{1}{\log_2 n} \right) \log_2 10 .$$ As $n \to +\infty$, both of the quantities in parentheses approach $1$, so the Squeeze Theorem lets us formalize your observation as the assertion $$\lim_{n \to \infty} \frac{\ell_2(n)}{\ell_{10}(n)} = \log_2 10 .$$

Plot of $\color{#7f0000}{\ell_2(n) / \ell_{10}(n)}$ for $1 \leq n < e^{2^6}$:

enter image description here

1
On

The number of digits is approximately(never off by more than 1) equal to the log in that base($\log_{10}(x)\approx$ the number of digits of x in base 10). Because of log math, you get:

$$\frac{\log_{10}(x)}{\log_2(x)}\approx 3.32193$$

0
On

An interesting, if inefficient way to calculate logs:

import string
import math
import matplotlib.pyplot as plt

huge_number = 21**31**3
b10len = len(str(huge_number))

NUMERALS = string.digits + string.lowercase
def baseN(num, b):
    digits = []
    while num:
        digits.append(NUMERALS[num % b])
        num = num // b
    return ''.join(reversed(digits))

bases = range(2, 30)
lengths = [len(baseN(huge_number, b)) for b in bases]

f, axs = plt.subplots(ncols=2)
axs[0].plot(bases, [b10len/l for l in lengths])
axs[1].plot(bases, [math.log10(x) for x in bases])

plt.show()

enter image description here

1
On

The fact that $$ 2^{10} = 1024 \approx 1000 = 10^3 $$ tells you that every ten binary digits need just over three decimal digits. Your question asks about exactly how many more than three (3.32...). The other answers (using logarithms, all good) identify the source of that number.