Why is the total number of possible symbol sequences not a suitable measure of information?


In Ralph Hartley's 1928 paper, "Transmission of Information", he outlines a quantification of information as $H = \log(s^n)$, where:

H = information

s = number of primary symbols

n = number of selections made

At the beginning of the paper, he gives the intuitive suggestion that the total number of possible sequences given $s$ primary symbols and $n$ selections, $s^n$, may be used as a measure of information. However, he then goes on to argue that this measure is not suitable, and that information should instead be proportional to $n$, with a constant of proportionality that depends on $s$. I cannot convince myself of the validity of this argument, and I suspect the reason may be a practical one related to some aspect of engineering.
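For concreteness, here is a small numeric sketch (my own illustration, not from the paper) comparing the two candidate measures for a binary alphabet:

```python
import math

s = 2  # number of primary symbols (a binary alphabet)
n = 8  # number of selections made

sequences = s ** n          # total number of possible sequences: 2^8 = 256
H = math.log2(s ** n)       # Hartley's measure (base 2): log(s^n) = n * log(s)

print(sequences)  # 256
print(H)          # 8.0  -- grows linearly in n, unlike s^n
```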

His justification is on page 5 of this document; I'm hoping someone can provide a more accessible explanation: http://www.uni-leipzig.de/~biophy09/Biophysik-Vorlesung_2009-2010_DATA/QUELLEN/LIT/A/B/3/Hartley_1928_transmission_of_information.pdf

Thank you.

Best answer

Storing/transmitting $n$ symbols $x_1,\ldots,x_n$ will clearly take half as much memory/time as storing $2n$ symbols $x_1,\ldots,x_n,x_{n+1},\ldots,x_{2n}$. So information should grow linearly with $n$, whereas the proposed measure $s^n$ would grow exponentially.
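A quick numeric check of this additivity argument (a sketch; the base-2 logarithm is my choice, any base works up to a constant factor):

```python
import math

def hartley(s, n):
    """Hartley information for n selections from s symbols: H = n * log2(s)."""
    return n * math.log2(s)

s, n = 10, 5
# Doubling the message length doubles the information...
assert math.isclose(hartley(s, 2 * n), 2 * hartley(s, n))
# ...whereas the raw count of sequences is squared, not doubled:
assert s ** (2 * n) == (s ** n) ** 2
```

The logarithm is exactly what converts the multiplicative growth of sequence counts into the additive growth we expect of "amount of information".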

As for the proportionality constant: if we number the symbols from $1$ to $s$, the number of digits required to represent one symbol is $\lceil \log_b s \rceil$, where $b$ is the base of the representation ($b = 2$ for binary, and so on). We simply omit the ceiling for ease of manipulation.
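A sketch of that digit count (my illustration; the alphabet of 26 letters is an arbitrary example):

```python
import math

def digits_needed(s, b):
    """Digits required in base b to give each of s symbols a distinct label."""
    return math.ceil(math.log(s, b))

print(digits_needed(26, 2))   # 5 bits suffice for a 26-letter alphabet (2^5 = 32 >= 26)
print(digits_needed(26, 10))  # 2 decimal digits suffice (10^2 = 100 >= 26)
```

Dropping the ceiling leaves exactly $\log_b s$ per symbol, hence $H = n \log s$ for the whole message.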