Optimization of entropy for fixed distance to uniform


Suppose that I know that a probability distribution with $n$ outcomes is very close to uniform (that is, to the distribution with $p_i=\frac{1}{n}$ for all $i$); in particular, for $n\epsilon\ll 1$, it satisfies $$\sum_{i=1}^n\left|\frac{1}{n}-p_i\right|=\epsilon$$ Now consider the Shannon entropy function: $$H(p_1,\ldots,p_n)=-\sum_{i=1}^np_i\log_2p_i$$ My question is: which probability distributions satisfying this constraint maximize and minimize the entropy? My guess is that $(\underbrace{\tfrac{1}{n},\ldots,\tfrac{1}{n}}_{n-2\text{ times}},\tfrac{1}{n}+\tfrac{\epsilon}{2},\tfrac{1}{n}-\tfrac{\epsilon}{2})$ minimizes the entropy and that, for even $n$, $(\underbrace{\tfrac{1}{n}-\tfrac{\epsilon}{n},\ldots,\tfrac{1}{n}-\tfrac{\epsilon}{n}}_{n/2\text{ times}},\underbrace{\tfrac{1}{n}+\tfrac{\epsilon}{n},\ldots,\tfrac{1}{n}+\tfrac{\epsilon}{n}}_{n/2\text{ times}})$ maximizes it, but I have not been able to prove either claim.
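For intuition, the two candidate distributions can be compared numerically. A quick sketch (the names `concentrated` and `spread` are mine, and $n$ is taken even so the second candidate is well defined):

```python
import math

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

n, eps = 100, 1e-3  # n*eps = 0.1 << 1, as in the question

uniform = [1.0 / n] * n
# Candidate minimizer: all of the L1 deviation placed on just two outcomes.
concentrated = [1.0 / n] * (n - 2) + [1.0 / n + eps / 2, 1.0 / n - eps / 2]
# Candidate maximizer: deviation eps/n spread over all n outcomes (n even).
spread = [1.0 / n - eps / n] * (n // 2) + [1.0 / n + eps / n] * (n // 2)

# Both satisfy the constraint sum |1/n - p_i| = eps.
assert abs(sum(abs(1.0 / n - p) for p in concentrated) - eps) < 1e-12
assert abs(sum(abs(1.0 / n - p) for p in spread) - eps) < 1e-12

print(entropy(uniform), entropy(concentrated), entropy(spread))
# The spread-out perturbation stays closer to log2(n) than the concentrated one.
```

Running this shows the ordering $H(\text{concentrated}) < H(\text{spread}) < \log_2 n$, in line with the guesses above, though of course it does not prove optimality.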

Best answer

For the first distribution, you have $$H=-(n-2)\frac 1n \log_2 \frac 1n-\left(\frac 1n + \frac {\epsilon}2\right)\log_2 \left(\frac 1n + \frac {\epsilon}2\right)-\left(\frac 1n - \frac {\epsilon}2\right)\log_2 \left(\frac 1n - \frac {\epsilon}2\right)$$ Since $\log_2\left(\frac 1n \pm \frac{\epsilon}2\right)=\log_2 \frac 1n+\log_2\left(1\pm\frac{n\epsilon}2\right)$, this is $$H=-(n-2)\frac 1n \log_2 \frac 1n-\left(\frac 1n + \frac {\epsilon}2\right)\left[\log_2 \frac 1n +\log_2 \left(1+\frac {n\epsilon}2\right)\right]-\left(\frac 1n - \frac {\epsilon}2\right)\left[\log_2 \frac 1n + \log_2 \left(1-\frac {n\epsilon}2\right)\right]$$ and expanding $\ln(1\pm x)\approx \pm x-\frac{x^2}{2}$ with $x=\frac{n\epsilon}2$ gives $$H \approx \log_2 n-\frac{n\epsilon^2}{4 \ln 2}$$

For the second you have $$H=-\frac n2\left(\frac {1-\epsilon}n \log_2 \frac {1-\epsilon}n + \frac {1+\epsilon}n \log_2 \frac {1+\epsilon}n\right)=-\frac n2\left(\frac {1-\epsilon}n \left[\log_2 \frac 1n + \log_2(1-\epsilon)\right] + \frac {1+\epsilon}n \left[\log_2 \frac 1n + \log_2(1+\epsilon)\right]\right)$$ and the same expansion with $x=\epsilon$ gives $$H\approx \log_2 n-\frac {\epsilon^2}{2 \ln 2}$$
So the second has a smaller change in entropy than the first, by a factor of $n/2$: the spread-out perturbation keeps the entropy closer to $\log_2 n$, consistent with the guesses in the question.
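Numerically, the entropy deficits below $\log_2 n$ come out as $\frac{n\epsilon^2}{4\ln 2}$ for the first distribution and $\frac{\epsilon^2}{2\ln 2}$ for the second, a ratio of $n/2$. A small check of both second-order predictions (variable names are mine; assumes $n\epsilon\ll 1$ and $n$ even):

```python
import math

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

n, eps = 1000, 1e-4  # n*eps = 0.1, inside the regime n*eps << 1

# First case: all of the L1 deviation on two outcomes.
first = [1.0 / n] * (n - 2) + [1.0 / n + eps / 2, 1.0 / n - eps / 2]
# Second case: deviation eps/n spread over all n outcomes (n even).
second = [1.0 / n - eps / n] * (n // 2) + [1.0 / n + eps / n] * (n // 2)

# Predicted entropy deficits below log2(n):
pred_first = n * eps**2 / (4 * math.log(2))
pred_second = eps**2 / (2 * math.log(2))

# Measured deficits:
meas_first = math.log2(n) - entropy(first)
meas_second = math.log2(n) - entropy(second)

print(meas_first / pred_first, meas_second / pred_second)  # both close to 1
print(meas_first / meas_second)                            # close to n/2
```

Both measured deficits agree with the second-order predictions to well under a percent at these parameter values.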