Khinchin (1957) famously proved the uniqueness theorem for Shannon's entropy under 4 axioms. Let $\mathbf{p}=(p_{1},\cdots,p_{n})$ be a probability distribution. Let $X$ be a random element following $\mathbf{p}$. (I will use the notation $H(X)$ and $H(\mathbf{p})$ interchangeably below.) The 4 axioms are
- $H(\mathbf{p})$ is continuous with respect to all its arguments.
- $H(\mathbf{p})$ is maximized when $\mathbf{p}$ is uniform ($p_{i}=1/n$ for all $i$).
- $H(p_{1},\cdots,p_{n},0)=H(p_{1},\cdots,p_{n})$.
- $H(X,Y)=H(X)+\sum_{i=1}^{I} P(X=i)H(Y|X=i)$ where $I$ is the number of letters $X$ can take, $P(X=i)$ is the marginal probability that $X$ takes the $i$th letter, and $H(Y|X=i)$ is the entropy of the conditional distribution of $Y$ given that $X$ takes the $i$th letter.
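
As a sanity check on how I am reading the last three axioms, here is a minimal Python sketch that evaluates them for Shannon's entropy $H(\mathbf{p})=-\sum_i p_i\log p_i$. The joint distribution `joint` and the helper `entropy` are my own made-up illustration, not anything from Khinchin's text, and the chain rule is checked in the form $H(X,Y)=H(X)+\sum_i P(X=i)H(Y|X=i)$.

```python
import math

def entropy(p):
    """Shannon entropy in nats; terms with p_i = 0 contribute 0."""
    return -sum(x * math.log(x) for x in p if x > 0)

# Hypothetical joint distribution P(X=i, Y=j): rows index X, columns index Y.
joint = [
    [0.10, 0.20, 0.10],
    [0.30, 0.05, 0.25],
]

p_x = [sum(row) for row in joint]                     # marginal of X
h_joint = entropy([q for row in joint for q in row])  # H(X, Y)

# sum_i P(X=i) * H(Y | X=i), using the conditional distribution of each row
h_cond = sum(px * entropy([q / px for q in row]) for px, row in zip(p_x, joint))

# Chain rule: H(X, Y) - (H(X) + sum_i P(X=i) H(Y | X=i)) should vanish
chain_rule_gap = h_joint - (entropy(p_x) + h_cond)

# Appending an impossible outcome should not change the entropy
padding_gap = entropy(p_x + [0.0]) - entropy(p_x)

# The uniform distribution on two letters should have larger entropy than p_x
uniform_gap = entropy([0.5, 0.5]) - entropy(p_x)

print(chain_rule_gap, padding_gap, uniform_gap)
```

The first two printed values should be zero up to floating-point error and the third should be positive. This only confirms that Shannon's entropy satisfies these axioms for one example; it says nothing about uniqueness.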
However, I have seen in some writings the four axioms restated with the first one replaced by
- (1*.) $H(\mathbf{p})$ is symmetric with respect to all its arguments (label independence).
I wonder whether replacing the first axiom (continuity) by 1*. is a mistake or yields an equivalent set of axioms.