I am trying to understand how the formula I(x) = -log(p(x)) for self-information was derived.
From what I have read, two constraints were imposed on the properties that we would like self-information to satisfy. These constraints are listed below:

- I(x) < I(y) if p(x) > p(y)
- I(x and y) = I(x) + I(y) if p(x and y) = p(x) p(y) (i.e. x and y are independent)
Following this we somehow find out that I(x) = -log(p(x)) satisfies the above requirements.
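As a quick numeric sanity check (my own sketch, in Python, using base-2 logarithms), we can confirm that the candidate I(p) = -log(p) does satisfy both constraints for some sample probabilities:

```python
import math

def I(p):
    """Candidate self-information of an event with probability p (base 2)."""
    return -math.log2(p)

p, q = 0.25, 0.5

# Constraint 1: the less probable event carries more information.
assert I(0.1) > I(0.9)

# Constraint 2: for independent events, information adds.
assert math.isclose(I(p * q), I(p) + I(q))

print(I(p), I(q), I(p * q))  # 2.0 1.0 3.0
```

Of course, a check on sample values shows only that -log(p) is *a* solution, not that it is the *only* one, which is exactly what the questions below ask.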
My exact questions are:

- Why did we define the requirements for self-information as above?
- How did we arrive at I(x) = -log(p(x))?
- How do we know that I(x) = -log(p(x)) uniquely satisfies the above requirements?
Just because they seem reasonable: they fit the concept we are trying to formalize. The first says that an event with low probability gives us a lot of information. The second says that the information from a pair of independent events is the sum of the individual informations.
Let's state our "axioms" in terms of a function $g$ of the probability alone, writing $I(x) = g(p(x))$:

1. $g(p) > g(q)$ if $p < q$ (monotonically decreasing)
2. $g(p\,q) = g(p) + g(q)$ (additivity for independent events)
It's easy to see that $g(p)= - \log_b (p)$ (arbitrary base $b>1$) satisfies those requirements. Proving uniqueness is a little more difficult. Sketch: suppose we have $n$ independent events, each with probability $p$; then the additivity axiom implies (by induction) $g(p^n)=n\, g(p)$. By considering that $g(p)=g([p^{1/m}]^m)=m\, g(p^{1/m})$ we deduce that $g(p^a)=a \, g(p)$ also holds for any rational $a$ — and, by continuity (which here follows from monotonicity), for any real $a$. Writing any $p \in (0,1)$ as $p = p_0^{\,a}$ for a fixed $p_0 \in (0,1)$ then gives $g(p) = a\, g(p_0) \propto -\log p$, which leads to the unique (up to the choice of base) solution $g(p)= - \log_b (p)$.
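The scaling identities in the sketch above can be checked numerically (my own illustration, in Python; the function name `g` matches the answer's notation):

```python
import math

def g(p):
    """The claimed unique solution, natural-log base."""
    return -math.log(p)

p = 0.3

# g(p^n) = n * g(p) for integer n (induction step of the sketch).
for n in range(1, 6):
    assert math.isclose(g(p ** n), n * g(p))

# g(p^(1/m)) = g(p) / m, hence g(p^a) = a * g(p) for rational a = k/m.
m, k = 7, 3
assert math.isclose(g(p ** (1 / m)), g(p) / m)
assert math.isclose(g(p ** (k / m)), (k / m) * g(p))
```

This does not replace the continuity argument — it only illustrates that the exponent-scaling property, which forces the logarithmic form, holds for the candidate solution.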