Is this an entropy identity?


For $\mathcal{X}$ finite, take $\mathcal{X}$-valued random variables $x_1,\dots, x_n$ with some joint distribution and a function $f:\mathcal{X}^n\to \mathbb{N}^n$. Also choose $I$ uniformly at random from $\{1,\dots,n\}$, independently of the $x_t$. Is it true that $$\frac{1}{n}H(f(x_1,\dots,x_n))= H(f(x_1,\dots, x_n)| I, (x_t)_{t\neq I}) ?$$


Accepted answer

Since $I$ is uniform on $[n]$ and independent of the $X_i$, we have \begin{equation} H(f(X_1,...,X_n)| I , X_1,..., X_{I-1},X_{I+1},...,X_n) \end{equation} \begin{equation} = \sum_{i \in [n]} \frac{1}{n} H(f(X_1,...,X_n) | X_1,..., X_{i-1},X_{i+1},...,X_n), \end{equation} and therefore the identity holds iff \begin{equation} H(f(X_1,...,X_n)) = \sum_{i \in [n]} H(f(X_1,...,X_n) | X_1,..., X_{i-1},X_{i+1},...,X_n), \end{equation} which fails in general.
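This averaging step can be checked numerically. Below is a minimal Python sketch (not part of the original answer; the joint distribution and the choice of $f$ as XOR are arbitrary illustrations) that computes $H(f(X)\mid I, X_{-I})$ directly and compares it with the average of the per-coordinate conditional entropies:

```python
from math import log2
from collections import defaultdict

def entropy(dist):
    """Shannon entropy in bits of a distribution {outcome: prob}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def cond_entropy(pairs):
    """H(Y | Z) computed from a joint distribution {(z, y): prob}."""
    by_z = defaultdict(lambda: defaultdict(float))
    for (z, y), p in pairs.items():
        by_z[z][y] += p
    total = 0.0
    for cond in by_z.values():
        pz = sum(cond.values())                      # P(Z = z)
        total += pz * entropy({y: v / pz for y, v in cond.items()})
    return total

# A small arbitrary example: two correlated bits, f = XOR
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
f = lambda x: x[0] ^ x[1]
n = 2

# Direct: H(f(X) | I, X_{-I}) with I uniform on {0, ..., n-1},
# independent of X; the conditioning variable is the pair (I, X_{-I}).
direct_joint = defaultdict(float)
for x, p in joint.items():
    for i in range(n):
        rest = x[:i] + x[i + 1:]
        direct_joint[((i, rest), f(x))] += p / n
direct = cond_entropy(direct_joint)

# Averaged: (1/n) * sum_i H(f(X) | X_{-i})
per_coord = []
for i in range(n):
    ji = defaultdict(float)
    for x, p in joint.items():
        ji[(x[:i] + x[i + 1:], f(x))] += p
    per_coord.append(cond_entropy(ji))
average = sum(per_coord) / n

print(direct, average)  # the two quantities coincide
```

Representing distributions as dictionaries keyed by outcomes keeps the conditioning step explicit: grouping by the first component of the key is exactly conditioning on $(I, X_{-I})$.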

Dependent Counter-example

Consider the case $f = \text{id}$. Then $H(f(X_1,...,X_n) | X_1,..., X_{i-1},X_{i+1},...,X_n) = H(X_i | X_1,..., X_{i-1},X_{i+1},...,X_n)$, so the identity becomes \begin{equation} H(X_1,...,X_n) = \sum_{i \in [n]} H(X_i | X_1,..., X_{i-1},X_{i+1},...,X_n). \end{equation} By the chain rule the left-hand side equals $\sum_{i \in [n]} H(X_i | X_1,...,X_{i-1})$, and since conditioning on additional variables cannot increase entropy, \begin{equation} \sum_{i \in [n]} H(X_i | X_1,..., X_{i-1},X_{i+1},...,X_n) \leq \sum_{i \in [n]} H(X_i | X_1,...,X_{i-1}), \end{equation} with strict inequality for suitably dependent $X_i$: for instance, if the $X_i$ are all equal to a single fair coin flip, the left-hand side of the identity is $1$ bit while every conditional entropy on the right is $0$. (Note that for independent $X_i$ both sides equal $\sum_{i \in [n]} H(X_i)$, so the identity actually holds for $f = \text{id}$ in that case, which is why the independent counterexample below needs a different $f$.)
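To see the failure concretely, here is a small self-contained Python check (an illustration, not from the original answer) using perfectly correlated fair bits:

```python
from math import log2
from collections import defaultdict

def entropy(dist):
    """Shannon entropy in bits of a distribution {outcome: prob}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def coord_cond_entropy(joint, i):
    """H(X_i | X_{-i}) for a joint distribution {tuple: prob}."""
    groups = defaultdict(lambda: defaultdict(float))
    for x, p in joint.items():
        groups[x[:i] + x[i + 1:]][x[i]] += p
    total = 0.0
    for cond in groups.values():
        pz = sum(cond.values())                      # P(X_{-i} = z)
        total += pz * entropy({k: v / pz for k, v in cond.items()})
    return total

# Perfectly correlated fair bits: X1 = ... = Xn, each uniform on {0, 1}
n = 3
joint = {(0,) * n: 0.5, (1,) * n: 0.5}

lhs = entropy(joint)  # H(X1,...,Xn) = 1 bit of randomness in total
# Knowing the other coordinates determines X_i, so each term is 0
rhs = sum(coord_cond_entropy(joint, i) for i in range(n))
print(lhs, rhs)  # prints 1.0 0.0
```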

Independent Counter-example

The counterexample when the $X_i$ are independent is a bit less trivial. Suppose the $X_i$ are i.i.d. uniformly random binary (i.e. nice and fair coin flips) and let $f(X_1,...,X_n) = X_1+...+X_n$. Then we have \begin{equation} H(f(X_1,...,X_n)) = - \sum_{k=0}^{n} \binom{n}{k} 2^{-n} \log \left(\binom{n}{k} 2^{-n} \right) \end{equation} and \begin{equation} \sum_{i \in [n]} H(f(X_1,...,X_n) | X_1,..., X_{i-1},X_{i+1},...,X_n) \end{equation} \begin{equation} = \sum_{i \in [n]} H(X_1+...+X_n | X_1,..., X_{i-1},X_{i+1},...,X_n) = \sum_{i \in [n]} H(X_i) = n, \end{equation} where the middle equality holds because, given $X_{-i}$, the sum is $X_i$ plus a known constant, and $X_i$ is independent of $X_{-i}$. These are not equal, since $X_1+...+X_n$ takes values in $\{0,1,...,n\}$, so \begin{equation} - \sum_{k=0}^{n} \binom{n}{k} 2^{-n} \log \left(\binom{n}{k} 2^{-n} \right) \leq \log(n+1) = \mathcal{O} \left( \log n \right): \end{equation} it takes only $\mathcal{O} \left( \log n \right)$ bits to describe the sum.
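The logarithmic growth of the left-hand side can be checked numerically; the following sketch (an illustration under the same i.i.d. fair-bit setup) computes the binomial entropy and compares it with the trivial bound $\log_2(n+1)$:

```python
from math import comb, log2

def sum_entropy(n):
    """H(X1 + ... + Xn) in bits for n i.i.d. fair bits X_i."""
    return -sum(comb(n, k) * 2.0**-n * log2(comb(n, k) * 2.0**-n)
                for k in range(n + 1))

for n in [1, 4, 16, 64, 256]:
    # The entropy of the sum grows only logarithmically in n,
    # while the (false) identity would require it to equal n.
    print(n, round(sum_entropy(n), 3), "<=", round(log2(n + 1), 3))
```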

Another answer

Unless I'm misunderstanding the setting, here is an obvious counterexample:

Let $f$ be the identity function. Since there are no restrictions on the joint probability distribution, let $P(x_1 = x_2 = ... = x_n) = 1$; that is, every outcome in which at least one of the random variables attains a different value has probability zero.

Then conditioning on even one of the random variables (in particular on $(x_t)_{t \neq I}$) makes the whole vector deterministic, so the conditional entropy is zero. Hence, for any such joint distribution with more than one outcome of positive probability,

$H(x_1,...,x_n \mid I, (x_t)_{t \neq I}) = 0 \neq \frac{1}{n} H(x_1,...,x_n).$