Suppose for some parameter $d$, we choose a string from the Hamming cube ($\{0,1\}^d$) by setting each bit to be $0$ with probability $p$ and $1$ with probability $1-p$. What is the entropy of this distribution on the Hamming cube? Clearly, if $p=\frac{1}{2}$, then the entropy would just be $\log \left(\frac{1}{2^d} \right) = d$. If $p = 0$, then the entropy would be $0$. What would be the entropy for the general case in terms of $p$? It would clearly be less than $d$....
2026-03-30 12:04:10.1774872250
On
Entropy of a distribution over strings
152 Views Asked by Bumbble Comm https://math.techqa.club/user/bumbble-comm/detail At
2
There are 2 best solutions below
2
On
In the Hamming $d$-cube there are $\binom{d}k$ points with $k$ zeroes and $d-k$ ones, so the entropy is
$$-\sum_{k=0}^d\binom{d}kp^k(1-p)^k\lg\left(p^k(1-p)^{d-k}\right)\;,$$
or
$$-\sum_{k=0}^d\binom{d}kp^k(1-p)^{d-k}\Big(k\lg p+(d-k)\lg(1-p)\Big)\;.$$
Now
$$\begin{align*} \sum_{k=0}^d\binom{d}kkp^k(1-p)^{d-k}&=d\sum_{k=0}^d\binom{d-1}{k-1}p^k(1-p)^{d-k}\\\\ &=dp\sum_{k=0}^{d-1}\binom{d-1}kp^k(1-p)^{d-k-1}\\\\ &=dp\;, \end{align*}$$
and
$$\sum_{k=0}^d\binom{d}kdp^k(1-p)^{d-k}=d\;,$$
so
$$-\sum_{k=0}^d\binom{d}kp^k(1-p)^k\lg\left(p^k(1-p)^{d-k}\right)=-d\Big(p\lg p+(1-p)\lg(1-p)\Big)\;.$$
The probability that we select a specific string with $k$ one bits is $$p^k(1-p)^{d-k}$$ Thus $$\Bbb{E}[-\log_2 P] = -\sum_{i\in\{0,1\}^d}P(i)\log_2P(i) \\= -\sum_{k=0}^{d}\binom{d}{k}p^k(1-p)^{d-k}\log_2(p^k(1-p)^{d-k})\\ =-\sum_{k=0}^d\binom{d}{k}p^k(1-p)^{d-k}[k\log_2(p)+(d-k)\log_2(1-p)]\\ = -\log(p) \Bbb{E}[H] - \log(1-p)\Bbb{E}[T]$$
Where $H\sim \operatorname{Binom}(d,p)$ and $T\sim\operatorname{Binom}(d,1-p)$. These respectively have means $dp$ and $d(1-p)$. The resulting entropy is therefore $$-d(p\log_2(p)+(1-p)\log_2(1-p))$$