What is the condition for a guaranteed reduction in entropy when shrinking the sample space


As an example, a fair die with 6 faces, each face corresponding to a number 1-6, has entropy $H(X) = -\sum_{i=1}^{6} \frac{1}{6} \log_2\left(\frac{1}{6}\right) \approx 2.58 $. A fair die with only 5 faces, each corresponding to a number 1-5, has entropy $H(X') = -\sum_{i=1}^{5} \frac{1}{5} \log_2\left(\frac{1}{5}\right) \approx 2.32 $. In this case, reducing the size of the sample space reduced the random variable's entropy. However, this is not true in general (think of an extremely unfair die with $P(1) = 0.995$ and $P(2) = \cdots = P(6) = 0.001$, and then remove face 1). I wonder whether there is an existing result in information theory that gives a condition for when this reduction in entropy holds for discrete random variables. My intuition is that we need to place a constraint on the outcome with the highest probability: in particular, its probability shouldn't exceed that of the least likely outcome by more than some factor that scales with the size of the sample space.
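To make the two examples concrete, here is a quick numerical check (a sketch with an inline `entropy` helper; all numbers are the ones quoted above):

```python
import math

def entropy(p):
    """Shannon entropy in bits of a probability vector p."""
    return -sum(q * math.log2(q) for q in p if q > 0)

# Fair 6-sided vs. fair 5-sided die: entropy drops.
print(entropy([1/6] * 6))  # ≈ 2.585
print(entropy([1/5] * 5))  # ≈ 2.322

# Extremely unfair die: removing the dominant face 1 leaves the
# uniform distribution on 5 faces, so entropy *increases*.
unfair = [0.995] + [0.001] * 5
print(entropy(unfair))       # ≈ 0.057
print(entropy([1/5] * 5))    # ≈ 2.322 after removing face 1
```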


1 Answer


Suppose WLOG that $X$ takes $n$ values in $\{1,2,\dots,n\}$, and let $Y$ be the variable obtained by removing the value $n$ (i.e. $X$ conditioned on $X \neq n$). Let $p_n = P(X=n)$.

Then $X$ can be viewed as a non-overlapping mixture of two distributions: one on the $n-1$ remaining values (the rv $Y$) and one concentrated on the single value $n$. By the grouping property of entropy:

$$H(X) = h(p_n) + p_n \times 0 + (1-p_n) H(Y)=h(p_n) + (1-p_n) H(Y) \tag1$$
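The identity $(1)$ is easy to verify numerically (a quick sketch on random distributions, treating the last value as the one removed):

```python
import math
import random

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def h(p):
    """Binary entropy function in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

random.seed(1)
for _ in range(1000):
    n = random.randint(2, 10)
    w = [random.random() for _ in range(n)]
    p = [x / sum(w) for x in w]
    pn = p[-1]
    y = [x / (1 - pn) for x in p[:-1]]  # distribution of Y = X | X != n
    # identity (1): H(X) = h(p_n) + (1 - p_n) H(Y)
    assert abs(entropy(p) - (h(pn) + (1 - pn) * entropy(y))) < 1e-9
```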

where $h(\cdot)$ is the binary entropy function. Substituting $H(Y)$ from $(1)$ and rearranging gives

$$H(X) > H(Y) \iff H(X) < \frac{h(p_n)}{p_n} =- \log(p_n) - \frac{1-p_n}{p_n}\, \log(1-p_n) \tag2$$
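A numerical sketch of the equivalence in $(2)$, comparing the direct entropy comparison against the criterion $H(X) < h(p_n)/p_n$ on random distributions (helper names are my own):

```python
import math
import random

def entropy(p):
    return -sum(q * math.log2(q) for q in p if q > 0)

def h(p):
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def entropy_drops(p):
    """True iff removing the last value of p (and renormalizing) lowers entropy."""
    pn = p[-1]
    y = [x / (1 - pn) for x in p[:-1]]
    return entropy(p) > entropy(y)

def criterion(p):
    """Condition (2): H(X) < h(p_n) / p_n."""
    pn = p[-1]
    return entropy(p) < h(pn) / pn

random.seed(2)
for _ in range(1000):
    n = random.randint(2, 8)
    w = [random.random() for _ in range(n)]
    p = [x / sum(w) for x in w]
    assert entropy_drops(p) == criterion(p)
```

For instance, for the unfair die above with the dominant face listed last, both sides of the equivalence are false: removing that face increases entropy.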

In particular, if $X$ is uniform, $p_n = 1/n$ and the inequality takes the form

$$ \log n < \log n - (n-1)\log(1-1/n) $$

which always holds, since $\log(1-1/n) < 0$.
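The uniform case can also be checked directly (a quick sketch; base-2 logs, which doesn't affect the inequality):

```python
import math

# Uniform X on n values: H(X) = log2(n), p_n = 1/n.
# Criterion (2) becomes  log n < log n - (n-1) log(1 - 1/n),
# which holds because log(1 - 1/n) < 0.
for n in range(2, 100):
    lhs = math.log2(n)
    rhs = math.log2(n) - (n - 1) * math.log2(1 - 1 / n)
    assert lhs < rhs
```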

I'm not sure what more can be said of $(2)$ in general.