An Upper Bound for Entropy $H(X)$ in Terms of $\mathbb{E}(\log X)$


Suppose that $X:\Omega\to\mathbb{N}$ is a discrete random variable with PMF $p:\mathbb{N}\to[0,1]$, where $\operatorname{supp} p = \operatorname{range} X = \mathbb{N}$. I want to prove that

If $\mathbb{E}(\log_2 X) < \infty$ then $H(X) < \infty$.

The first thing that comes to mind is to find an upper bound for $H(X)$ in terms of $\mathbb{E}(\log_2 X)$. Expanding the expression for each of these we have

$$ H(X) = \sum_{i=1}^{\infty}p(i)\log_2\frac{1}{p(i)}, \qquad \mathbb{E}(\log_2 X) = \sum_{i=1}^{\infty}p(i)\log_2 i. $$

I have no idea how to proceed from here.
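(For concreteness, here is a quick truncated computation of both series for one admissible PMF, $p(i)=(6/\pi^2)\,i^{-2}$; this is just a sanity check, not progress on the bound.)

```python
# Quick sanity check (truncated sums) for the concrete PMF
# p(i) = (6/pi^2) / i^2, supported on all of N.
import math

C = 6 / math.pi**2      # normalizer: sum_{i>=1} 1/i^2 = pi^2/6
N = 10**6               # truncation point; both tails beyond N are negligible

H = E = 0.0
for i in range(1, N + 1):
    p = C / i**2
    H += p * math.log2(1 / p)   # entropy series
    E += p * math.log2(i)       # E[log2 X] series

print(f"H(X)      ~ {H:.4f} bits")   # ~ 2.3626
print(f"E(log2 X) ~ {E:.4f} bits")   # ~ 0.8223
```

For this particular $p$ one has $H(X)=\log_2(\pi^2/6)+2\,\mathbb{E}(\log_2 X)$ exactly, so at least in this example a bound of the form $H(X)\leq 2\,\mathbb{E}(\log_2 X)+C$ holds.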


2 Answers

Best Answer

If your only purpose is to show the implication $\mathbb{E}[\log_2 X] < \infty \implies H(X) < \infty$, here is a quick argument:

Let $\mathcal{I} = \{ i \in \mathbb{N}_1 : p(i) \leq \frac{1}{ei^2} \}$, where $\mathbb{N}_1 = \{1, 2, \ldots\}$ is the set of positive integers. Then by noting that $x \mapsto x\log_2(1/x)$ is increasing for $x \in [0, 1/e]$,

\begin{align*} \sum_{i \in \mathcal{I}} p(i) \log_2\frac{1}{p(i)} &\leq \sum_{i \in \mathcal{I}} \frac{1}{ei^2} \log_2(ei^2) \\ &\leq \sum_{i=1}^{\infty} \frac{\log_2 e + 2\log_2 i}{ei^2} < \infty. \end{align*}

On the other hand, we have $ei^2 > \frac{1}{p(i)}$ for $i \notin \mathcal{I}$. So by using that $x \mapsto \log_2 x$ is increasing,

\begin{align*} \sum_{i \notin \mathcal{I}} p(i) \log_2\frac{1}{p(i)} &\leq \sum_{i \notin \mathcal{I}} p(i)[\log_2 e + 2\log_2 i] \\ &\leq \sum_{i=1}^{\infty} p(i)[\log_2 e + 2\log_2 i] \\ &= \log_2 e + 2 \mathbb{E}[\log_2 X] < \infty. \end{align*}

Therefore $H(X) < \infty$.

(Remark. The above argument actually proves an inequality of the form $H(X) \leq 2\mathbb{E}[\log_2 X] + C$ for some absolute constant $C$. Of course, this bound is far from being optimal.)
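As a numerical sanity check (my addition, not part of the proof): a short Python sketch, assuming that truncating all series at $N$ terms is accurate enough, comparing $H(X)$ against $2\,\mathbb{E}[\log_2 X]+C$ with the constant $C=\log_2 e+\sum_{i\geq1}\frac{\log_2 e+2\log_2 i}{ei^2}\approx 3.31$ that the argument above actually produces:

```python
# Sanity check of H(X) <= 2*E[log2 X] + C, with the constant the argument
# above actually produces: C = log2(e) + sum_i (log2(e) + 2*log2(i))/(e*i^2).
import math

N = 10**5   # truncation point for all series

def H_and_E(p):
    """Truncated entropy and logarithmic moment of a PMF p on {1, 2, ...}."""
    H = E = 0.0
    for i in range(1, N + 1):
        pi = p(i)
        if pi <= 0.0:                 # underflowed tail: x*log2(1/x) -> 0
            continue
        H += pi * math.log2(1 / pi)
        E += pi * math.log2(i)
    return H, E

C = math.log2(math.e) + sum(
    (math.log2(math.e) + 2 * math.log2(i)) / (math.e * i * i)
    for i in range(1, N + 1)
)   # C ~ 3.31

for name, p in [("p(i) ~ 1/i^2", lambda i: (6 / math.pi**2) / i**2),
                ("p(i) = 2^-i ", lambda i: 0.5**i)]:
    H, E = H_and_E(p)
    print(f"{name}:  H = {H:.3f}  <=  2E + C = {2 * E + C:.3f}")
```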

Answer

The paper Wyner, A. D., An upper bound on the entropy series, Information and Control, 20, pp. 176–181 (1972) provides a general (nonlinear) upper bound for the entropy $$H(\mathbf{p}):=\sum_n p_n\log(1/p_n)$$ in terms of the expectation $L(\mathbf{p})=\sum_n p_n\log n$.

The paper uses neat analytical arguments that I think would be nice to discuss at MSE. In the remainder of this posting, I will elaborate a little further and derive a proof of the statement in the OP along the way.


Consider the set $\mathcal{A}$ of monotone nondecreasing nonnegative sequences $\boldsymbol{a}$ such that $\phi_a(s):=\sum_n e^{-sa_n}<\infty$ for some $s>0$. Such a sequence necessarily satisfies $a_n\xrightarrow{n\rightarrow\infty}\infty$.

For $\boldsymbol{a}\in\mathcal{A}$, set $s_a:=\inf\{s>0:\phi_a(s)<\infty\}$. Then

  1. $(s_a,\infty)\subset\{s>0: \phi_a(s)<\infty\}$.
  2. Clearly $\phi_a:(s_a,\infty)\rightarrow(0,\infty)$ is strictly monotone decreasing.
  3. Extended to the half-plane $\{z\in\mathbb{C}:\operatorname{Re} z>s_a\}$, $\phi_a$ is analytic.
  4. $\phi_a$ is strictly convex (indeed, $\phi''_a>0$).

Define the function $$\psi_a(s)=\frac{1}{\phi_a(s)}\sum_n a_n e^{-a_n s}$$

  5. By (2), $\psi_a$ is finite and differentiable on $(s_a,\infty)$ (in fact, $\psi_a=-\frac{\phi'_a}{\phi_a}$ on $(s_a,\infty)$).

Notice that $$\psi'_a(s)=\frac{(\phi'_a(s))^2-\phi_a(s)\phi''_a(s)}{\phi^2_a(s)}=\frac{\Big(\sum_na_ne^{-sa_n}\Big)^2-\Big(\sum_ne^{-a_ns}\Big)\Big(\sum_na^2_ne^{-a_ns}\Big)}{\Big(\sum_n e^{-sa_n}\Big)^2}<0$$ by the Cauchy–Schwarz inequality (the inequality is strict since the $a_n$ are not all equal). Hence,

  6. $\psi_a$ is strictly monotone decreasing on $(s_a,\infty)$.

Let $\ell$ be the smallest integer such that $a_\ell>a_1$ (notice that $\ell\geq2$ is well defined since $a_n\nearrow\infty$ as $n\rightarrow\infty$). Then $$a_1\leq \psi_a(s)< \frac{(\ell-1)a_1e^{-a_1s}+\sum_{n\geq\ell}a_ne^{-a_ns}}{(\ell-1)e^{-a_1s}}=a_1+ \Big(\sum_{n\geq\ell}\frac{a_n}{\ell-1}e^{-(a_n-a_1)s}\Big)$$

  7. Letting $s\rightarrow\infty$ in the last display (monotone convergence), $\lim_{s\rightarrow\infty}\psi_a(s)=a_1$.

Let $p_a:=\lim_{s\searrow s_a}\psi_a(s)$. It follows from (6) and (7) that

  8. $\psi_a$ is a smooth and strictly monotone decreasing function from $(s_a,\infty)$ onto $(a_1, p_a)$ and so it admits a smooth, strictly monotone decreasing inverse $\psi^{-1}_a:(a_1,p_a)\rightarrow(s_a,\infty)$.
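To make these objects concrete, here is a small numerical sketch (my own illustration, using truncated sums and anticipating the choice $a_n=\log n$ from the application below) that tabulates $\phi_a$ and $\psi_a$ and inverts $\psi_a$ by bisection, as justified by (6) and (8):

```python
# Numerical sketch of phi_a, psi_a and psi_a^{-1} for the choice a_n = log n
# (so phi_a(s) = sum_n n^{-s} and s_a = 1).  All series are truncated at N,
# so values are approximate, and increasingly crude as s approaches 1.
import math

N = 10**5

def phi(s):
    # phi_a(s) = sum_n exp(-s*a_n) = sum_n n^{-s}
    return sum(n**-s for n in range(1, N + 1))

def psi(s):
    # psi_a(s) = (sum_n a_n exp(-s*a_n)) / phi_a(s)
    return sum(math.log(n) * n**-s for n in range(1, N + 1)) / phi(s)

def psi_inv(beta, lo=1.001, hi=50.0, iters=60):
    # psi_a is strictly decreasing by (6), so bisection finds psi_a^{-1}(beta)
    for _ in range(iters):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if psi(mid) > beta else (lo, mid)
    return (lo + hi) / 2

print(psi(2.0), psi(3.0))   # strictly decreasing: ~0.5700 > ~0.1648
t = psi_inv(0.5)
print(t, psi(t))            # psi(t) ~ 0.5 at t ~ 2.1
```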

The following result gives an upper bound for the entropy.

Theorem (Wyner) Suppose $\mathbf{p}$ is a probability measure on $(\mathbb{N},2^{\mathbb{N}})$ for which there is a sequence $\boldsymbol{a}\in\mathcal{A}$ such that $\mathbb{E}[a_X]=\sum_na_np_n\leq\beta$, where $a_1< \beta<p_a=\lim_{s\searrow s_a}\psi_a(s)$. Then $$H(\mathbf{p})\leq\log\Big(\phi_a\big(\psi^{-1}_a(\beta)\big)\Big)+\beta\,\psi^{-1}_a(\beta)$$

Proof: Define the probability measure $\mathbf{p}^*$ by $$ p^*_n:=\frac{e^{-a_n\psi^{-1}_a(\beta)}}{\phi_a(\psi^{-1}_a(\beta))}$$ A simple calculation yields \begin{align} H(\mathbf{p}^*)=\sum_np^*_n\log(1/p^*_n)&=\log\big(\phi_a(\psi^{-1}_a(\beta))\big) +\psi^{-1}_a(\beta)\sum_na_np^*_n\\ &=\log\big(\phi_a(\psi^{-1}_a(\beta))\big) +\psi^{-1}_a(\beta)\psi_a(\psi^{-1}_a(\beta))\\ &= \log\big(\phi_a(\psi^{-1}_a(\beta))\big) +\psi^{-1}_a(\beta)\beta\\ &\geq \log\big(\phi_a(\psi^{-1}_a(\beta))\big) +\psi^{-1}_a(\beta)\sum_na_np_n\\ &=-\sum_np_n\log(p^*_n) \end{align} where the inequality uses $\sum_na_np_n\leq\beta$ and $\psi^{-1}_a(\beta)>s_a\geq0$.

Consequently \begin{align} H(\mathbf{p})-H(\mathbf{p}^*)\leq -\sum_np_n\log(p_n)+\sum_np_n\log(p^*_n)= \sum_np_n\log\Big(\frac{p^*_n}{p_n}\Big)\leq \sum_np_n\Big(\frac{p^*_n}{p_n}-1\Big)=0 \end{align} where the latter inequality follows from the fact that $\log x\leq x-1$ for all $x>0$. $\blacksquare$
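A numerical check of the theorem, again for $a_n=\log n$ with natural logarithms and truncated sums (a sketch, not part of the proof): for $p_n=2^{-n}$ the bound should strictly dominate $H(\mathbf{p})$, while $p_n=(6/\pi^2)n^{-2}$ is itself of the form $\mathbf{p}^*$ (with $\psi^{-1}_a(\beta)=2$), so there the bound should be attained up to truncation error.

```python
# Check of Wyner's bound for a_n = log n (natural logarithms):
#   H(p) <= log(phi_a(t)) + beta*t,  t = psi_a^{-1}(beta),  beta = sum p_n log n.
# Same truncated phi/psi/bisection as in the sketch above.
import math

N = 10**5

def phi(s):
    return sum(n**-s for n in range(1, N + 1))

def psi(s):
    return sum(math.log(n) * n**-s for n in range(1, N + 1)) / phi(s)

def psi_inv(beta, lo=1.001, hi=50.0, iters=60):
    for _ in range(iters):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if psi(mid) > beta else (lo, mid)
    return (lo + hi) / 2

def wyner_check(p):
    H = beta = 0.0
    for n in range(1, N + 1):
        pn = p(n)
        if pn <= 0.0:      # underflowed tail contributes nothing
            continue
        H += pn * math.log(1 / pn)
        beta += pn * math.log(n)
    t = psi_inv(beta)
    return H, math.log(phi(t)) + beta * t

for name, p in [("2^-n", lambda n: 0.5**n),
                ("(6/pi^2)/n^2", lambda n: (6 / math.pi**2) / n**2)]:
    H, bound = wyner_check(p)
    print(f"p_n = {name}:  H = {H:.4f}   bound = {bound:.4f}")
# Expect H < bound in the geometric case and H = bound (up to truncation)
# in the 1/n^2 case, which is exactly p* with t = 2.
```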


Back to the OP's question

Suppose $\mathbf{p}$ is a probability measure on $(\mathbb{N},2^{\mathbb{N}})$ for which $$\sum^\infty_{n=1}p_n\log n <\infty$$

The sequence $a_n=\log n$ belongs to $\mathcal{A}$. In fact $\phi_a(s)=\sum_n e^{-sa_n}=\sum_n\frac{1}{n^s}$ is finite exactly on $(1,\infty)$, so $s_a=1$. By (7), $$a_1=\lim_{s\rightarrow\infty}\psi_a(s)=\log 1=0,$$ and moreover $$p_a=\lim_{s\searrow 1}\psi_a(s)=\infty.$$ Indeed, for any integer $M>0$ and $s>1$ \begin{align} \psi_a(s)\geq\frac{1}{\phi_a(s)}\sum_{n\geq e^M}\log n\,\frac{1}{n^s}\geq M\frac{1}{\phi_a(s)}\sum_{n\geq e^M}\frac{1}{n^s}\xrightarrow{s\rightarrow1+}M, \end{align} since $\lim_{s\rightarrow1+}\phi_a(s)=\sum_n\frac1n=\infty$ by monotone convergence, while the finitely many terms with $n<e^M$ stay bounded as $s\rightarrow1+$. Consequently, any $\beta>0$ with $\beta\geq\sum_np_n\log n$ satisfies $a_1=0<\beta<p_a=\infty$, and Wyner's theorem yields $$H(\mathbf{p})\leq\log\Big(\phi_a\big(\psi^{-1}_a(\beta)\big)\Big)+\beta\,\psi^{-1}_a(\beta)<\infty,$$ which proves the statement in the OP.
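The divergence $p_a=\infty$ can be watched numerically; the following sketch assumes the third-party mpmath library, whose zeta function is reliable near the pole at $s=1$ (truncated sums are not):

```python
# Watching p_a = lim_{s -> 1+} psi_a(s) = infinity for a_n = log n.
# Here psi_a(s) = -zeta'(s)/zeta(s); mpmath evaluates zeta accurately
# near the pole at s = 1, where truncated sums would be misleading.
from mpmath import mp, zeta, diff

mp.dps = 25                               # working precision in decimal digits
psi = lambda s: -diff(zeta, s) / zeta(s)  # psi_a = -phi_a'/phi_a, phi_a = zeta

for s in [2.0, 1.5, 1.2, 1.05, 1.01]:
    print(s, psi(s))   # grows without bound, roughly like 1/(s - 1), as s -> 1+
```

Near $s=1$ the values track $1/(s-1)$, consistent with $\zeta(s)\sim\frac{1}{s-1}$ and $\zeta'(s)\sim-\frac{1}{(s-1)^2}$.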


Final remarks:

Under additional assumptions on $\mathbf{p}$, finiteness of the logarithmic moment is equivalent to finiteness of the entropy. To be more precise, we have

Corollary: If $\mathbf{p}$ satisfies the additional assumption that $0<p_{n+1}\leq p_n$ for all $n\in\mathbb{N}$, then $H(\mathbf{p})<\infty$ iff $\sum_np_n\log n<\infty$.

Necessity has been discussed at MSE before. See also Theorem 1 in Wyner's paper.
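To see why some assumption like monotonicity is needed for the "only if" direction, consider placing mass $2^{-k}$ on $n_k=2^{2^k}$, $k\geq1$: the entropy series converges while the logarithmic moment diverges. A quick numerical confirmation (my addition; it works with $\log n_k=2^k\log 2$ directly, so no huge integers are formed):

```python
# Non-monotone example: mass 2^{-k} on n_k = 2^(2^k), k >= 1.  Its entropy
# H = sum_k 2^{-k} * k * log 2 converges (to 2*log 2), while the logarithmic
# moment sum_k 2^{-k} * log(n_k) = sum_k 2^{-k} * 2^k * log 2 diverges.
import math

for K in [10, 20, 40]:
    H = sum(2.0**-k * k * math.log(2) for k in range(1, K + 1))
    L = sum(2.0**-k * (2.0**k * math.log(2)) for k in range(1, K + 1))
    print(f"K = {K:2d}:  H_partial = {H:.4f}   L_partial = {L:.4f}")
# H_partial -> 2*log(2) ~ 1.3863, while L_partial = K*log(2) grows linearly.
```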