Let $E$ be a set of size $n$. I define a probability measure on the (unordered) pairs of elements of $E$: $p_{ij} = p_{ji} \geq 0$ with $\sum_i \sum_{j\leq i} p_{ij} = 1$.
I want to compute a tweaked definition of entropy. Given a sequence of unordered pairs $\{i, j\}$ drawn from $p_{ij}$, imagine the following game between players A and B:
- A chooses either $i$ or $j$ and encodes it. Write $z(i,j) \in \{i, j\}$ for her choice (we may assume the optimal choice is deterministic).
- B has to retrieve the element encoded by A.
The goal is for B to retrieve every element perfectly while minimizing the expected size of the messages between A and B. Write $q_i$ for the probability that A outputs $i$:
$$q_i = \sum_{j \,:\, z(i,j) = i} p_{ij}.$$
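For concreteness, here is a sketch of how $q$ follows from a choice function (hypothetical helper name; I assume $p$ is stored as a lower-triangular NumPy array with $\sum_i \sum_{j\le i} p_{ij} = 1$, and $z$ as a dict mapping each pair $(i, j)$ with $j \le i$ to its chosen element):

```python
import numpy as np

def q_from_choice(p, z):
    """Accumulate A's output distribution: each pair {i, j} contributes
    its probability p[i, j] to the element z[(i, j)] that A encodes."""
    n = p.shape[0]
    q = np.zeros(n)
    for i in range(n):
        for j in range(i + 1):  # unordered pairs, j <= i
            q[z[(i, j)]] += p[i, j]
    return q
```

Since every pair contributes its full probability to exactly one element, $q$ is itself a probability distribution.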
The minimal average number of bits per message is then
$$\min_z H(q) = \min_z \, -\sum_i \sum_{j\leq i} p_{ij} \log q_{z(i,j)} \;\geq\; \min_z \, -\sum_i \sum_{j\leq i} p_{ij} \log \max(q_i, q_j),$$
since $q_{z(i,j)} \in \{q_i, q_j\}$ (the first equality comes from expanding $H(q) = -\sum_i q_i \log q_i$ with the definition of $q_i$).
By optimality the inequality is an equality, so the optimal $z$ always encodes the element with the larger $q$: $z$ induces a total order that can be recovered from the values of $q$ alone.
In other words, there is a permutation $\sigma$ such that $q_{\sigma(1)} \leq q_{\sigma(2)} \leq \cdots \leq q_{\sigma(n)}$ and $q_{\sigma(i)} = \sum_{j\leq i} p_{\sigma(i)\sigma(j)}$.
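As a sanity check, the entropy induced by a candidate order can be computed directly from this formula (a sketch; `entropy_of_order` is a hypothetical name, and $p$ is assumed to be a symmetric NumPy array with $\sum_i \sum_{j\le i} p_{ij} = 1$):

```python
import numpy as np

def entropy_of_order(p, sigma):
    """H(q) in bits for the order sigma, where each pair {sigma(i), sigma(j)},
    j <= i, is encoded by the later element sigma(i):
    q[sigma(i)] = sum_{j <= i} p[sigma(i), sigma(j)]."""
    n = len(sigma)
    q = np.zeros(n)
    for i in range(n):
        for j in range(i + 1):
            q[sigma[i]] += p[sigma[i]][sigma[j]]
    nz = q[q > 0]  # drop zero entries so log2 is well defined
    return float(-(nz * np.log2(nz)).sum())
```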
Now my problem is to find $\sigma$ in a computationally efficient manner. From the above properties, $\sigma$ minimizes the entropy of $q$.
Assuming we swap two adjacent elements $i$ and $j$ in the order, only the pair $\{i, j\}$ changes hands, so $q_i \to q_i - p_{ij}$ and $q_j \to q_j + p_{ij}$; the resulting change in entropy (new minus old) is:
$$q_i \log q_i + q_j \log q_j - (q_i - p_{ij}) \log(q_i - p_{ij}) - (q_j + p_{ij}) \log (q_j + p_{ij}).$$
By convexity of $x \mapsto x \log x + (A-x) \log(A-x)$ with $A = q_i + q_j$ fixed (the function is symmetric about $A/2$, hence increasing in $\max(x, A-x)$), the sign of this expression is the sign of $\max(q_i, q_j) - \max(q_i - p_{ij}, q_j + p_{ij})$.
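This sign relation is easy to check numerically (a sketch with a hypothetical `swap_delta`; natural log, since only the sign matters):

```python
import math

def swap_delta(qi, qj, pij):
    """Entropy change (new minus old) when the pair {i, j} is re-encoded
    from i to j, i.e. q_i -> q_i - p_ij and q_j -> q_j + p_ij."""
    def xlogx(x):
        return x * math.log(x) if x > 0 else 0.0
    return xlogx(qi) + xlogx(qj) - xlogx(qi - pij) - xlogx(qj + pij)

# The sign agrees with max(q_i, q_j) - max(q_i - p_ij, q_j + p_ij):
# swap_delta(0.5, 0.3, 0.1) > 0 since the max drops from 0.5 to 0.4
# (a flatter q has higher entropy), and swap_delta(0.3, 0.5, 0.1) < 0
# since the max rises from 0.5 to 0.6.
```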
Does this comparison define a total order? If so, there is an $\mathcal{O}(n \log n)$ algorithm.
The answer is no, unfortunately.
Taking
the optimum is 1.00 with the permutation (0,1,2).
(1,2,0) gives 1.09 and (1,0,2) gives 1.11, so swapping 0 and 2 does not improve the score.
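Counterexamples like this can be reproduced by exhaustive search, which stays cheap for small $n$ (a sketch; `best_order_bruteforce` is a hypothetical name, and $p$ is assumed symmetric and normalized as above):

```python
import itertools
import numpy as np

def best_order_bruteforce(p):
    """Try all n! orders and return (best_order, entropy_in_bits), where
    each pair is encoded by its later element in the order."""
    n = p.shape[0]

    def entropy(sigma):
        q = np.zeros(n)
        for i in range(n):
            for j in range(i + 1):
                q[sigma[i]] += p[sigma[i], sigma[j]]
        nz = q[q > 0]
        return float(-(nz * np.log2(nz)).sum())

    # min over the lazily generated (order, entropy) pairs, compared by entropy
    return min(((s, entropy(s)) for s in itertools.permutations(range(n))),
               key=lambda t: t[1])
```

This is $\mathcal{O}(n! \cdot n^2)$, so it only serves to validate (or search for) small counterexamples, not as a practical algorithm.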