Using decimals of $\pi$ to store data

Question

1.2k Views Asked by Bumbble Comm At 15 May 2026 - 8:03

I recently read about an idea: instead of storing the actual data, convert the data to a string of digits and store the index at which this pattern occurs in some number, for example $\pi$. The idea is that the index would take up less storage space than the data itself.

Of course, we don't know whether $\pi$ is a normal number and hence we do not know if every finite decimal pattern occurs, but let's assume for the moment that it does (or one simply changes to some proven normal number, like the Copeland-Erdős constant).

The thing that struck me was whether the index of the data might actually be a larger number than the data itself. Is there some measure of the probability of finding a decimal sequence of length $n$ before the $m$th decimal place? For $\pi$ in this case, I doubt there's a general formula. Would it depend on the base?

Information or references to other, similar ideas are also very welcome.

(Yes, I understand that this method is very impractical for everyday use; I just found the idea intriguing.)

There are 3 best solutions below

Bumbble Comm On 14 Jul 2014 - 8:59

The probability that the $n$th digit of a normal number (if it has essentially random digits) is a given digit is $\frac{1}{10}$. Therefore the probability of finding a given sequence of $m$ digits starting at the $n$th digit is $\frac{1}{10^m}$, so, treating the starting positions as independent, the probability that the sequence has not occurred within the first $n$ positions is $\left(1-\frac{1}{10^m}\right)^n$. One way to guess the position at which you might find the sequence is to find the point at which the probability of having found it is $\frac 1 2$, that is $n = \log_{1-10^{-m}}\left(\frac 1 2\right) = \frac{\ln\frac 1 2}{\ln(1-10^{-m})} \approx 10^m \ln 2 \approx 0.69 \cdot 10^m$. This is a number with just about the same number of digits as the sequence you are trying to store, and so saves no space.
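As a rough numerical check of the median estimate $\ln(\frac 1 2)/\ln(1-10^{-m})$, here is a minimal Python sketch. The function name `median_index` and the sample values of $m$ are illustrative choices, not part of the original answer, and the formula still relies on the independence approximation.

```python
import math

def median_index(m: int) -> float:
    """Position n at which (1 - 10**-m)**n drops to 1/2 (independence approximation)."""
    # log1p(-x) computes ln(1 - x) accurately even when x = 10**-m is tiny.
    return math.log(0.5) / math.log1p(-(10.0 ** -m))

for m in (1, 2, 5, 10):
    n = median_index(m)
    # Compare the length of the pattern with the number of digits in the index.
    print(f"m = {m:2d}  median index ~ {n:.1f}  digits in index = {len(str(math.ceil(n)))}")
```

For each $m$ tried here, the index has as many digits as the pattern, which is the point of the answer: on average nothing is saved.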

Bumbble Comm On 14 Jul 2014 - 9:59

You don't need to know anything about $\pi$ to answer this one: you just need to know that you can't get something for nothing.

To be more precise: any lossless compression algorithm (which is what your scheme is), if it makes some inputs shorter, must also make some inputs longer. If an algorithm were able to compress, say, all 10000-bit inputs into fewer than 10000 bits, then the number of possible outputs ($2^{10000} - 1$, counting every bit string of length 0 to 9999) would be less than the number of possible inputs ($2^{10000}$), so at least one pair of inputs would inevitably compress to the same output.
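The counting argument can be checked by brute force for a small length. The sketch below is only an illustration: the helper names and the choice $n = 4$ are assumptions made for the example. It enumerates every $n$-bit string and every strictly shorter bit string, including the empty one.

```python
from itertools import product

def strings_of_length(n: int) -> list[str]:
    """All bit strings of exactly n bits (a single empty string when n == 0)."""
    return ["".join(bits) for bits in product("01", repeat=n)]

def count_shorter_than(n: int) -> int:
    """Number of bit strings with length 0, 1, ..., n - 1."""
    return sum(2 ** k for k in range(n))

n = 4
inputs = strings_of_length(n)
shorter = [s for k in range(n) for s in strings_of_length(k)]

# There is one fewer candidate output than inputs, so no injective
# (lossless) map from inputs to strictly shorter outputs can exist.
print(len(inputs), len(shorter))
```

The closed form `count_shorter_than(10000) == 2**10000 - 1` gives the figure quoted in the answer.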

**Bumbble Comm** · Accepted Answer

Working in base 10 and assuming (a big assumption) that the digits of $\pi$ are random (uniformly distributed and i.i.d.), then given a number with $k$ decimal digits, the probability of finding it before some position $n$ is difficult to compute in general (see e.g. here), and it might depend on the number itself.

A simplifying assumption is that all the match attempts are independent (no overlapping); this is obviously false, but in many asymptotic settings it is a fair approximation. We would then have a geometric random variable with success probability $p=10^{-k}$, whose expected value is $1/p = 10^{k}$. That is the same order of magnitude as the number itself. Hence, under this very coarse approximation, the "index" is on average of the same magnitude as the number itself.
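A small simulation can give a feel for the expected value of about $10^k$ and for the remark that the answer might depend on the number itself. This is a sketch under stated assumptions: it uses pseudo-random digits rather than the digits of $\pi$, and the patterns `"42"` and `"11"`, the trial count, and the seed are arbitrary illustrative choices.

```python
import random

def first_index(pattern: str, rng: random.Random) -> int:
    """1-based position at which `pattern` first starts in a stream of random digits."""
    k = len(pattern)
    window = ""
    pos = 0
    while True:
        # Keep only the last k digits generated so far.
        window = (window + str(rng.randrange(10)))[-k:]
        pos += 1
        if window == pattern:
            return pos - k + 1

def mean_first_index(pattern: str, trials: int = 2000, seed: int = 0) -> float:
    """Average first-occurrence position over independent random digit streams."""
    rng = random.Random(seed)
    return sum(first_index(pattern, rng) for _ in range(trials)) / trials

for pattern in ("42", "11"):
    print(pattern, mean_first_index(pattern))
```

Both averages should land near $10^2$. A self-overlapping pattern such as `"11"` is expected to take slightly longer than `"42"`, but with this few trials the gap is close to the sampling noise.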
