Entropy of convolution of measures


Let $G$ be a countable, discrete group, and let $\mu_1,\mu_2$ be probability measures on the group $G$. We define the entropy of $\mu_i$ as

$H(\mu_i)=\sum\limits_{g \in G}-\mu_i(g)\log(\mu_i(g))$ (with the convention that $0 \cdot \log0=0$).

Recall that the convolution of $\mu_1$ and $\mu_2$ is defined as:

$(\mu_1 \star \mu_2)(g)=\sum\limits_{h \in G}\mu_1(gh^{-1})\mu_2(h)$.

Now, it's a fact that $H(\mu_1 \star \mu_2) \leq H(\mu_1)+H(\mu_2)$. The usual proof goes by showing that entropy decreases under factor maps.

My question is: Is there a completely hands-on proof of this fact, preferably using the concavity of $\log x$ or of $-x\log x$?
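(Not part of the question, but the inequality is easy to sanity-check numerically. A minimal Python sketch, using the cyclic group $\mathbb{Z}/n\mathbb{Z}$ as a stand-in for $G$; `entropy`, `convolve`, and `random_measure` are ad-hoc helper names, not from any library.)

```python
import math
import random

def entropy(mu):
    """Shannon entropy of a probability vector, with the convention 0*log 0 = 0."""
    return -sum(p * math.log(p) for p in mu if p > 0)

def convolve(mu1, mu2, n):
    """Convolution on Z/nZ: (mu1 * mu2)(g) = sum_h mu1(g - h) mu2(h),
    since g h^{-1} = g - h when the group is written additively."""
    return [sum(mu1[(g - h) % n] * mu2[h] for h in range(n)) for g in range(n)]

def random_measure(n):
    """A random probability measure on n points."""
    w = [random.random() for _ in range(n)]
    s = sum(w)
    return [x / s for x in w]

random.seed(0)
n = 5
mu1, mu2 = random_measure(n), random_measure(n)
conv = convolve(mu1, mu2, n)

print(entropy(conv) <= entropy(mu1) + entropy(mu2) + 1e-12)  # True
```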


BEST ANSWER

With no convexity... Write the group operation additively. If $P$ and $Q$ are probability measures and $R=P\ast Q$, then $$H(R)=-\sum_xR(x)\log R(x)=-\sum_{x,y}P(y)Q(x-y)\log R(x).$$

For every $(x,y)$ we have $R(x)\geqslant P(y)Q(x-y)$, hence $$H(R)\leqslant-\sum_{x,y}P(y)Q(x-y)\log(P(y)Q(x-y))=-\sum_{x,y}P(y)Q(x)\log(P(y)Q(x)),$$ where the last identity holds because, for every $y$, the mapping $x\mapsto x-y$ is a bijection from $G$ to $G$. Thus, $$H(R)\leqslant-\sum_{y}P(y)\log(P(y))\left(\sum_xQ(x)\right)-\sum_{x}Q(x)\log(Q(x))\left(\sum_yP(y)\right).$$

Finally, $P$ and $Q$ are probability measures, hence the two sums in parentheses equal $1$ and we are left with the inequality $$H(R)\leqslant-\sum_{y}P(y)\log(P(y))-\sum_{x}Q(x)\log(Q(x))=H(P)+H(Q).$$
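The one step that does the work here is the pointwise bound $R(x)\geqslant P(y)Q(x-y)$, which holds because $R(x)$ is a sum of nonnegative terms including $P(y)Q(x-y)$. A quick Python check of that bound on $\mathbb{Z}/6\mathbb{Z}$ (my own sketch; `rand_measure` is an ad-hoc helper, not from the answer):

```python
import math
import random

def rand_measure(n):
    """A random probability measure on n points."""
    w = [random.random() for _ in range(n)]
    s = sum(w)
    return [x / s for x in w]

random.seed(1)
n = 6
P, Q = rand_measure(n), rand_measure(n)

# R = P * Q on Z/nZ.
R = [sum(P[y] * Q[(x - y) % n] for y in range(n)) for x in range(n)]

# The key pointwise bound: R(x) >= P(y) Q(x - y) for every pair (x, y).
ok = all(R[x] >= P[y] * Q[(x - y) % n] - 1e-15
         for x in range(n) for y in range(n))
print(ok)  # True
```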

ANSWER

You can use the concavity of $\phi(x) = -x\log x$ to show this inequality directly. Since $\phi$ is concave with $\phi(0)=0$, it is subadditive:

$$\phi\left(\sum_i x_i\right) \leq \sum_i \phi(x_i)$$

We have

$$H(\mu_1 \star \mu_2) = \sum_{g\in G} \phi\bigl((\mu_1 \star \mu_2)(g)\bigr) = \sum_{g\in G} \phi\left(\sum_{h \in G}\mu_1(gh^{-1})\mu_2(h)\right)$$

$$\leq \sum_{g,h\in G} \phi\left(\mu_1(gh^{-1})\mu_2(h)\right) = S$$

We now compute the sum $S$ explicitly:

$$S = \sum_{g,h\in G} -\mu_1(gh^{-1})\mu_2(h)\log(\mu_1(gh^{-1})\mu_2(h))$$ $$= \sum_{g,h\in G} -\mu_1(gh^{-1})\mu_2(h)\log(\mu_1(gh^{-1}))+\sum_{g,h\in G}-\mu_1(gh^{-1})\mu_2(h)\log(\mu_2(h))$$

In each summation we can substitute $g = g'h$ and sum over $g'$ instead:

$$S = \sum_{g',h\in G} -\mu_1(g')\mu_2(h)\log(\mu_1(g'))+\sum_{g',h\in G}-\mu_1(g')\mu_2(h)\log(\mu_2(h))$$

Summing over $h$ first in the first sum and over $g'$ first in the second, and using that $\mu_1$ and $\mu_2$ are probability measures, we have

$$S =\sum_{h\in G}\mu_2(h)\sum_{g'\in G} -\mu_1(g')\log(\mu_1(g'))+\sum_{g'\in G}\mu_1(g')\sum_{h\in G}-\mu_2(h)\log(\mu_2(h))$$ $$ =\sum_{g'\in G} -\mu_1(g')\log(\mu_1(g'))+\sum_{h\in G}-\mu_2(h)\log(\mu_2(h)) $$ $$= H(\mu_1)+H(\mu_2)$$

and we are done.
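The only nontrivial ingredient above is the subadditivity of $\phi$. A small Python sanity check of that step, over random tuples in $[0,1]$ (my own sketch, not from the answer):

```python
import math
import random

def phi(x):
    """phi(x) = -x log x, with the convention phi(0) = 0."""
    return 0.0 if x == 0 else -x * math.log(x)

random.seed(2)
# phi is concave with phi(0) = 0, hence subadditive:
# phi(x1 + ... + xk) <= phi(x1) + ... + phi(xk).
for _ in range(1000):
    xs = [random.uniform(0, 0.2) for _ in range(5)]  # sums stay in [0, 1]
    assert phi(sum(xs)) <= sum(phi(x) for x in xs) + 1e-12

print("subadditivity holds on all samples")
```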

ANSWER

Since the convolution of measures corresponds to the law of the sum of independent random variables, let $Z=X+Y$, and use the chain rule $H(Z, Y) = H(Z) + H(Y|Z) = H(Y) + H(Z|Y)$. We have

$$ \begin{array}{rclllll} H(Z) &=& H(Z|Y)&+&H(Y)&-&H(Y|Z) \\ &=&H(X)&+& H(Y) &-&H(Y|Z)\\ &\le&H(X)&+& H(Y) &&\\ \end{array} $$

For the second row we've used $H(Z|Y)=H(X+Y|Y)=H(X|Y)=H(X)$, where the last equality follows from the independence of $X,Y$, and the previous one from the translation invariance $H(X+a)=H(X)$.

(Notice that I'm using different notation than the question: in $H(X)$, $X$ is the random variable, not the probability measure; that's why $H(\mu_1 \star \mu_2)$ in your notation corresponds to $H(X+Y)=H(Z)$ here.)

BTW: This also shows that equality holds only if $H(Y|Z)=0$, that is, when the sum determines the summands: knowing $Z$, you can recover $Y$ (and hence $X$).
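The identity $H(Z) = H(X) + H(Y) - H(Y|Z)$ driving this answer can be verified numerically. A Python sketch for independent $X,Y$ on $\mathbb{Z}/4\mathbb{Z}$ (my own check; `H` and `rand_measure` are ad-hoc helpers):

```python
import math
import random
from collections import defaultdict

def H(dist):
    """Shannon entropy of a distribution given as {outcome: probability}."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

def rand_measure(n):
    """A random probability measure on {0, ..., n-1}."""
    w = [random.random() for _ in range(n)]
    s = sum(w)
    return {g: x / s for g, x in enumerate(w)}

random.seed(3)
n = 4
pX, pY = rand_measure(n), rand_measure(n)

# Joint law of (Z, Y) with Z = X + Y (mod n), X and Y independent.
pZY = defaultdict(float)
for x, px in pX.items():
    for y, py in pY.items():
        pZY[((x + y) % n, y)] += px * py

# Marginal law of Z (the convolution of pX and pY).
pZ = defaultdict(float)
for (z, y), p in pZY.items():
    pZ[z] += p

# Conditional entropy via the chain rule: H(Y|Z) = H(Z, Y) - H(Z).
HYgZ = H(pZY) - H(pZ)

# The identity H(Z) = H(X) + H(Y) - H(Y|Z) from the answer:
print(abs(H(pZ) - (H(pX) + H(pY) - HYgZ)) < 1e-12)  # True
```

Since $H(Y|Z) \geq 0$, dropping it recovers $H(Z) \leq H(X) + H(Y)$.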