How to define the entropy of a list of numbers?


Consider a list of numbers $\{a_1,a_2,\dots,a_n\}$. After sorting the $n$ numbers in increasing order, how much does the entropy change?

Updated

Alternatively, we can frame the problem in terms of the number of bits needed to describe the list. For a list sorted in increasing order, $\{a_1', a_2', \dots, a_n'\}$, we can convert it to the d-gap list $\{a_1', a_2'-a_1', \dots, a_n'-a_{n-1}'\}$. With a fixed-length code, this takes $n \log \left(\max_i (a_i'-a_{i-1}'+1)\right)$ bits, whereas a list in random order needs $n \log \left(\max_i a_i+1\right)$ bits, which is more than the d-gap list requires.
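A small sketch of this bit-count comparison (the example values are made up for illustration):

```python
import math

def fixed_length_bits(values):
    # Fixed-length code: each of the n values gets ceil(log2(max + 1)) bits,
    # enough to represent any value in the list.
    return len(values) * math.ceil(math.log2(max(values) + 1))

def dgap_bits(sorted_values):
    # Convert a sorted list to d-gaps: first element, then successive differences.
    gaps = [sorted_values[0]] + [
        b - a for a, b in zip(sorted_values, sorted_values[1:])
    ]
    # Fixed-length code over the gaps, which are typically much smaller.
    return len(gaps) * math.ceil(math.log2(max(gaps) + 1))

a = [3, 7, 12, 20, 21]       # already sorted, increasing
print(fixed_length_bits(a))  # max value 21 -> 5 bits each, 25 bits total
print(dgap_bits(a))          # gaps [3, 4, 5, 8, 1], max 8 -> 4 bits each, 20 bits total
```

The d-gap list wins because the maximum gap is generally much smaller than the maximum value, so each entry needs fewer bits.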



BEST ANSWER

I would argue that under any "natural" formulation of this problem (e.g., the numbers are drawn independently from the same probability distribution), the entropy should decrease by $\lg n!$.
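A sanity check of this claim, under assumptions not stated in the answer (two iid values drawn uniformly from a large finite alphabet, so ties are rare): the entropy of the sorted pair should sit close to $\lg 2! = 1$ bit below the entropy of the unsorted pair.

```python
import math
from collections import Counter
from itertools import product

M = 1000  # alphabet size; chosen large so ties (which blur the count) are rare

# Entropy of the unsorted pair: two independent uniform draws over M symbols.
h_unsorted = 2 * math.log2(M)

# Exact entropy of the sorted pair, by enumerating all M*M outcomes.
counts = Counter(tuple(sorted(pair)) for pair in product(range(M), repeat=2))
total = M * M
h_sorted = -sum((c / total) * math.log2(c / total) for c in counts.values())

print(h_unsorted - h_sorted)  # close to lg(2!) = 1 bit
```

The decrease works out to exactly $(M-1)/M$ bits here, approaching $1$ bit as $M$ grows; the shortfall comes from the tied pairs, for which sorting destroys no information.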


This should be a comment to Craig's answer. I agree that a decrease of $\log(n!)$ seems the right answer, almost obvious from the "information" point of view, but we don't have a proof yet.

Assuming we have $n$ iid random variables $x_i$, we know the joint entropy of $(x_1, x_2, \dots, x_n)$ is $n H(x)$. The question amounts to computing the joint entropy of the ordered variables $(y_1, y_2, \dots, y_n)$, where $y_1 = x_{(1)}$ is the first order statistic, etc. This is in principle doable, but I don't see how to obtain a general result.

To give an example: assume $x_i$ is uniform on $[0,M]$ (continuous variables are trickier with regard to entropy interpretations, but here the math is a little simpler, and as long as we don't change scales we are allowed to compare entropies; the same could also be done with discrete variables). The joint (differential) entropy is $H_0 = n \log(M)$. The ordered variable $y$ can equivalently be specified by the differences $(z_1, z_2, \dots, z_n)$, where $z_i = y_i - y_{i-1}$ (taking $y_0 = 0$). We know that, for large $n$, the $z_i$ are asymptotically independent and exponential with mean $M/n$; since an exponential with mean $\mu$ has differential entropy $1 + \log \mu$, the joint entropy tends to $H_1 = n\,[1 - \log(n/M)] = n - n \log(n) + n \log(M)$. The decrease in entropy that results from ordering is then $H_0 - H_1 = n \log(n) - n \approx \log(n!)$, by Stirling's approximation.
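Stirling's approximation gives $\log(n!) \approx n \log(n) - n$, which underlies the last step; a quick numerical check (the choice $n = 10^6$ is arbitrary):

```python
import math

n = 10**6
stirling = n * math.log(n) - n        # leading terms of Stirling's formula (nats)
exact = math.lgamma(n + 1)            # log(n!) via the log-gamma function
print(abs(stirling - exact) / exact)  # tiny relative error
```

The neglected correction term is $\tfrac{1}{2}\log(2\pi n)$, which is vanishingly small relative to $n \log(n)$ for large $n$.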