On what page did Shannon axiomatize self-information?


I am trying to find out who first derived the formula I(x) = -log(p(x)) for self-information from axioms. The wiki entry on information content and several other online sources all say it is in Shannon's well-known article "A Mathematical Theory of Communication," Bell System Technical Journal, July 1948. However, I have looked closely at this article and can only find an axiomatization of Shannon entropy; I cannot find an axiomatization of self-information. Can anyone please tell me the page number, equation number, or section number where this supposed axiomatization of self-information is hiding? I have found references after Shannon 1948, e.g. Luce 1960, that contain the axiomatization of self-information I am seeking.


There is 1 best solution below


Most of the time, people cite Shannon's work to acknowledge his contributions; that does not necessarily mean every result was derived by Shannon or presented in the same way he presented it. As you have already seen in Shannon's original paper, he considered the uncertainty of a random variable (or the information conveyed by a random variable) and directly obtained the entropy function.

As far as I know, the term "self-information" first appears in Prof. Fano's textbook (and later in Prof. Gallager's IT textbook). However, both begin with a more general quantity $i(E_1; E_2)$, called the "mutual information between events $E_1$ and $E_2$." The self-information $h(E)$ of an event $E$ is then defined as $h(E)=i(E;E)$. Indeed, taking the expectation of $i(E;E)=h(E)$ over all elementary events $E$ associated with a random variable $X$ gives the well-known identity $I(X;X)=H(X)$, where $I(\cdot\,;\cdot)$ and $H(\cdot)$ are the usual mutual information and entropy, respectively.
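As a quick numerical sanity check of the relation $h(E)=i(E;E)$ and of $I(X;X)=H(X)$, here is a minimal Python sketch. It assumes the standard pointwise form $i(E_1;E_2)=\log_2\frac{P(E_1\cap E_2)}{P(E_1)P(E_2)}$ (in bits); the function names and the example distribution are my own illustrative choices, not taken from either textbook.

```python
import math

def pointwise_mutual_information(p_joint, p_a, p_b):
    """i(A;B) = log2( P(A and B) / (P(A) * P(B)) ), measured in bits."""
    return math.log2(p_joint / (p_a * p_b))

def self_information(p):
    """h(E) = i(E;E). Since P(E and E) = P(E), this reduces to -log2 P(E)."""
    return pointwise_mutual_information(p, p, p)

def entropy(pmf):
    """H(X) = -sum_x p(x) * log2 p(x), skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

# Hypothetical distribution of X over three elementary events.
pmf = [0.5, 0.25, 0.25]

# Expectation of h(E) over the elementary events {X = x}.
expected_self_info = sum(p * self_information(p) for p in pmf)

print(expected_self_info, entropy(pmf))
```

For this distribution the two printed values agree, which is just the identity $I(X;X)=H(X)$ evaluated on a small example.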

In Prof. Fano's approach, you will find a different set of axioms, which are used to obtain a mathematical expression for the mutual information between two events. The expression for self-information then follows immediately from that one. However, I personally prefer to use the four axioms shown on the Wiki page to define self-information, since that route is more straightforward. I hope the above helps.
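For reference, here is a sketch of how axioms of this kind pin down the logarithm. The three conditions below are one common way to state them and are my own paraphrase; the exact list and wording on the Wiki page may differ. Suppose the information of an event depends only on its probability, $I(p)$ for $p\in(0,1]$, and satisfies:

$$
\begin{aligned}
&\text{(i)}\quad I(1) = 0, \\
&\text{(ii)}\quad p < q \;\Rightarrow\; I(p) \ge I(q), \\
&\text{(iii)}\quad I(pq) = I(p) + I(q) \quad \text{(additivity for independent events)}.
\end{aligned}
$$

Setting $g(u) = I(e^{-u})$ for $u \ge 0$ turns (iii) into Cauchy's functional equation, and monotonicity (ii) forces the linear solution:

$$
g(u+v) = g(u) + g(v) \;\Rightarrow\; g(u) = K u \ \ (K \ge 0)
\;\Rightarrow\; I(p) = -K \log p .
$$

The constant $K$ only fixes the unit (for example, bits when the logarithm is taken to base 2).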