Here is the transformation: $$\begin{align*} &1\to(0)\\ &2\to(1)\\ &3\to(10)\\ &4\to((1))\\ &5\to(100)\\ &6\to(11)\\ &7\to(1000)\\ &8\to((10))\\ &9\to((1)0)\\ &10\to(101)\\ &11\to(10000)\\ &12\to(1(1))\\ &13\to(100000)\\ &14\to(1001)\\ &15\to(110)\\ &16\to(((1)))\\ &17\to(1000000)\\ &18\to((1)1)\\ &19\to(10000000)\\ \end{align*}$$ and so on. The idea is to use the $i$-th position to represent the exponent of the $i$-th prime in the factorization. There is a recursive hierarchy, with parentheses marking the beginning and end of each number, and exponents greater than $1$ themselves encoded recursively. Since primes are the building blocks of all numbers, this representation should carry maximum information with a minimum number of digits compared to any other binary representation.
- Can you prove me right or wrong?
- Is this being already used?
P.S. "Assuming that $\psi$ and $\phi$ are two uniquely decodable codes for the natural numbers, $\psi$ is more compact than $\phi$ if the average number of digits that $\psi$ uses to represent $N$ natural numbers chosen uniformly at random between $1$ and $M$ (with $M > N$ and $N \to \infty$) is less than that of $\phi$."
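The transformation above, and the compactness definition in the P.S., can be checked empirically. Below is a sketch (not the poster's code): `encode` implements the recursive prime-exponent encoding from the table, and the last few lines compare average code length against plain binary on a random sample, counting parentheses as code symbols since they are needed for unique decodability.

```python
import math
import random

def primes_upto(limit):
    """List of primes <= limit via a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i*i::i] = bytearray(len(sieve[i*i::i]))
    return [i for i, flag in enumerate(sieve) if flag]

PRIMES = primes_upto(10**4)  # enough primes for the sample range below

def encode(n):
    """n = p1^e1 * p2^e2 * ... over consecutive primes is written
    (e_k ... e_1), largest prime first; an exponent >= 2 is itself
    encoded recursively.  1 -> (0), following the table's convention."""
    if n == 1:
        return "(0)"
    exps = []
    for p in PRIMES:
        if n == 1:
            break
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        exps.append(e)
    return "(" + "".join(str(e) if e <= 1 else encode(e)
                         for e in reversed(exps)) + ")"

# Empirical version of the P.S. definition: average symbol count over
# a random sample from [1, M].
random.seed(0)
M, N = 10**4, 2000
sample = [random.randint(1, M) for _ in range(N)]
avg_prime  = sum(len(encode(n))   for n in sample) / N
avg_binary = sum(n.bit_length()   for n in sample) / N
print(avg_prime, avg_binary)
```

Running this reproduces the table entries (e.g. `encode(12)` gives `(1(1))`) and shows the prime encoding is far longer on average than binary, in line with the answer below.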
Take $N$ large. Then the number of integers in $[1,N]$ which have a prime factor greater than $\sqrt N$ is asymptotic to $N \log 2$ (see http://en.wikipedia.org/wiki/Dickman_function). Each such number requires at least $\pi(\sqrt{N}) \sim 2 \sqrt{N}/\log N$ digits, since its encoding must extend out to the position of that large prime factor. Therefore the average number of digits used will be asymptotically at least as high as $(2\log 2) \sqrt{N} / \log N$. This is grossly inefficient on average, compared to the $\log N / \log 2 + O(1)$ bits that the standard binary representation uses.
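The density claim can be sanity-checked numerically; here is a small sketch (`largest_prime_factor` is a naive trial-division helper, not part of the argument) that counts integers up to $N$ whose largest prime factor exceeds $\sqrt N$ and compares the fraction with $\log 2 \approx 0.693$.

```python
import math

def largest_prime_factor(n):
    """Largest prime factor of n by trial division (returns 1 for n = 1)."""
    lp, d = 1, 2
    while d * d <= n:
        while n % d == 0:
            lp = d
            n //= d
        d += 1
    return n if n > 1 else lp

N = 2 * 10**4
root = math.isqrt(N)
count = sum(1 for n in range(1, N + 1) if largest_prime_factor(n) > root)
print(count / N, math.log(2))  # fraction should be near log 2
```

Even at this modest $N$ the fraction is already close to $\log 2$, so a constant proportion of all integers pay the $\pi(\sqrt N)$-digit penalty.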
Update: actually, the contribution from the primes by themselves is already large enough to upset the average. The total number of digits consumed by the $\approx N/\log N$ primes up to $N$ is about $\tfrac12 (N/\log N)^2$, since the $k$-th prime's code occupies $k$ positions, so the average digit usage coming from just the primes is asymptotic to $N/(2\log^2 N)$. It is unrealistic to expect such a wildly irregular encoding, with its exponentially bad worst case, to beat the standard binary encoding on average.
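The update's count is easy to verify concretely. A sketch (my notation, not the poster's): the $k$-th prime encodes as a $1$ followed by $k-1$ zeros, so its block holds $k$ digits (ignoring the two parentheses); summing over all primes up to $N$ and dividing by $N$ gives the primes' share of the average, which already dwarfs the binary length $\log_2 N$.

```python
import math

def count_primes(N):
    """pi(N) via a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (N + 1)
    sieve[:2] = b"\x00\x00"
    for i in range(2, math.isqrt(N) + 1):
        if sieve[i]:
            sieve[i*i::i] = bytearray(len(sieve[i*i::i]))
    return sum(sieve)

N = 10**4
k = count_primes(N)                    # pi(10^4) = 1229
digits_from_primes = k * (k + 1) // 2  # the j-th prime's block has j digits
avg_from_primes = digits_from_primes / N
print(avg_from_primes, math.log2(N))   # primes alone vs. full binary average
```

Already at $N = 10^4$, the primes alone contribute roughly $76$ digits per integer on average, against about $13.3$ bits for the entire standard binary representation.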