Suppose that $N = \mbox{factorial}(9999999999)$. The number $N$ is mind-bogglingly huge, and yet it can be represented very neatly and compactly as $\mbox{factorial}(9999999999)$. I have two questions:
How much "information" is contained in "factorial()"?
Approximately how much compression did we achieve here?
I thought I could perhaps write and compile a program that computes the factorial of a number $N$, and compare (the size of the executable + the size of $9999999999$) with the number of bits needed to store $N$. But this would vary with the language used, and the executable contains things other than the factorial definition. So how do I quantify the amount of information that something like factorial(x) comprises?
PS: Please excuse the dumb question. I just find it fascinating that such large numbers can have such compact representations.
The notion you’re looking for is called Kolmogorov complexity.
The idea is that you consider some collection of expressions, each of which evaluates to at most one value (though an expression may fail to evaluate to any value at all — for example, $1/0$). For example, we may allow expressions formed using base-10 number notation, addition, multiplication, subtraction, division, and factorial. Then "$9999999999!$" would be an expression in the language, and it would evaluate to the number $9999999999!$.
We define the complexity of a number to be the length of the shortest expression which evaluates to that number. It seems very likely (though I haven't proved it) that the complexity of $9999999999!$ relative to our language would be $11$ — the length of the expression "$9999999999!$". We can compare this to the number of digits of the number itself, which by Stirling's approximation is roughly $10^{11}$, to see how much compression we have achieved.
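To put concrete numbers on this without ever computing the factorial itself, one can estimate the digit count from $\log_{10} N! = \ln\Gamma(N+1)/\ln 10$, using Python's `math.lgamma`. A quick sketch:

```python
import math

N = 9999999999
expr = "9999999999!"  # the 11-character expression in our toy language

# log10(N!) = lgamma(N + 1) / ln(10); lgamma works on the logarithm
# directly, so we never materialise the astronomically large N! itself.
digits = math.floor(math.lgamma(N + 1) / math.log(10)) + 1

print(len(expr))   # 11 characters
print(digits)      # roughly 9.57e10 digits, i.e. on the order of 10^11
print(digits // len(expr))  # compression factor in the billions
```

So the 11-character expression stands in for a decimal representation nearly $10^{11}$ digits long — a compression factor of several billion.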
Note that the most general case we would consider for a language of expressions would be a programming language like Python, Haskell, Java, etc. (or at least an idealised version of these languages without IO or randomness). With a programming language, we could define factorial within the language and then use it, so we would be able to put a number on how much information it takes to encode factorial itself. The downside to using a programming language for our expressions is that, by the halting problem, there is no algorithm to determine whether an expression actually represents a value or will run into an infinite loop when evaluated.
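To make the "shortest expression" definition concrete, here is a toy brute-force search (my own illustration, not a standard algorithm): the language is just strings over digits and Python's `+`, `-`, `*`, `**` operators, so `eval` can serve as the evaluator, and the complexity of a target is the length of the shortest string that evaluates to it. This is only feasible for small targets, since the search space grows exponentially with expression length:

```python
from itertools import product

# A toy "language": strings over digits and Python's +, -, * , ** operators.
ALPHABET = "0123456789+-*"

def complexity(target, max_len=5):
    """Return the shortest expression (up to max_len chars) evaluating to target."""
    for length in range(1, max_len + 1):
        for chars in product(ALPHABET, repeat=length):
            expr = "".join(chars)
            try:
                # eval is safe here: the alphabet contains only digits and
                # arithmetic operators, so no names or calls can appear.
                if eval(expr) == target:
                    return expr
            except (SyntaxError, ValueError, ZeroDivisionError, TypeError):
                pass  # not a well-formed expression in our language
    return None

print(complexity(387420489))  # "9**9": 4 characters vs 9 decimal digits
```

Note that nothing here decides whether an arbitrary program halts — that is exactly why this toy language deliberately excludes loops and recursion: every string either evaluates immediately or is a syntax error.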
Keep in mind that compression will always depend on the specifics of the data layout. If I wanted to, I could come up with a new data format consisting of base-10 number notation and the symbol $n$, where $n$ is meant to denote $9999999999!$. Then, I could represent the number in question with only a single character (by design). But this would be a pretty stupid encoding system.
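That rigged format is trivial to sketch. Since $9999999999!$ is infeasible to compute, the snippet below uses $1000!$ as a stand-in for the reserved constant; everything else about the scheme is as described:

```python
import math

# A deliberately rigged format: the single reserved symbol "n" decodes to a
# fixed huge constant (1000! here, standing in for 9999999999!); any other
# string is read as ordinary base-10 notation.
BIG = math.factorial(1000)

def decode(s: str) -> int:
    return BIG if s == "n" else int(s)

print(len("n"))        # 1 character
print(len(str(BIG)))   # 2568 digits in 1000!
print(decode("42"))    # ordinary numbers still decode normally
```

The "compression" from 2568 digits down to 1 character is real but useless: the scheme only helps for the one number it was built around, which is why the complexity of a number is always relative to a fixed, reasonably general language.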