I'm actually interested in the derivation of the information measure $I(p)=-\log p$ of a single event of probability $p$, as the Shannon entropy can then be defined as the average over all events. Strangely enough, I could only find two axiomatic derivations of $I$ itself, of which the one mentioned here is the less restrictive:
1. $I(p)\ge 0$ and $I(1)=0$;
2. $I(p_1 p_2)=I(p_1)+I(p_2)$;
3. $I$ is continuous.
My question is whether condition (3) could be replaced with the condition that $I$ be monotonically decreasing. More generally, I'm looking for a simple derivation that makes use of both condition (1) and the monotonicity argument (less probable events should be more informative).
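As a quick numerical sanity check of the axioms above (a sketch, not part of any derivation): the following verifies that $I(p)=-\log p$ satisfies non-negativity, additivity, and monotonicity on random samples, and contrasts it with a hypothetical monotone-decreasing alternative $J(p)=1-p$, which satisfies condition (1) and monotonicity but fails additivity.

```python
import math
import random

def I(p):
    """Candidate information measure: I(p) = -log p."""
    return -math.log(p)

def J(p):
    """Hypothetical alternative: continuous, J(1)=0, monotone decreasing,
    but NOT additive over independent events."""
    return 1.0 - p

random.seed(0)
for _ in range(1000):
    p1 = random.uniform(1e-6, 1.0)
    p2 = random.uniform(1e-6, 1.0)
    # Condition (1): non-negativity, with I(1) = 0
    assert I(p1) >= 0 and I(1.0) == 0.0
    # Condition (2): additivity for independent events
    assert math.isclose(I(p1 * p2), I(p1) + I(p2), abs_tol=1e-9)
    # Monotonicity: the less probable event is at least as informative
    assert I(min(p1, p2)) >= I(max(p1, p2))

# J is monotone decreasing with J(1) = 0, yet additivity fails:
# J(0.5 * 0.5) = 0.75, while J(0.5) + J(0.5) = 1.0
assert not math.isclose(J(0.25), J(0.5) + J(0.5))
```

The contrast with $J$ is why monotonicity alone seems weaker than additivity: it constrains only the ordering of values, not how information combines across independent events.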