A limit problem $0\log \cfrac{0}{0}=0$


How can we show that $$0\log \cfrac{0}{0}=0 ?$$

PS. Not homework. This is taken as a convention in the book Elements of Information Theory by Cover, and the book claims it holds by continuity (page 31). It is used in other places in the book too.

[Extract from the book, page 31]

and Page 19

[Extract from the book, page 19]

The copyright of these extracts belongs to the book's publisher. Thanks.

There are 5 answers below.

Accepted answer (score 15)

Let $x_n=y_n=1/n$ and $z_n=\mathrm e^{-n^2}$; then $$ x_n\log(y_n/z_n)\to+\infty\quad\text{and}\quad x_n\log(z_n/y_n)\to-\infty. $$ Since $(x_n,y_n,z_n)\to(0,0,0)$ and $(x_n,z_n,y_n)\to(0,0,0)$, this "implies" that $$ 0\log(0/0)=+\infty\quad\text{and}\quad 0\log(0/0)=-\infty. $$

Edit (addressing some concerns raised by the revised version of the question and some exchanges in the comments): For every positive $x$ and $y$, define $h(x,y)=x\log(x/y)$. Then $h$ has limits on the axes $[x=0,y\gt0]$ and $[x\gt0,y=0]$ since, for every $x\gt0$, $\lim\limits_{y\to0}h(x,y)=+\infty$, and, for every $y\gt0$, $\lim\limits_{x\to0}h(x,y)=0$. But $h$ has no limit at $(0,0)$.

To see this, note that $(x,\mathrm e^{-1/x^2})\to(0,0)$ and $h(x,\mathrm e^{-1/x^2})\to+\infty$ when $x\to0$ and that $(x,x)\to(0,0)$ and $h(x,x)\to0$ when $x\to0$. Likewise, for every positive $c$, $(x,\mathrm e^{-c/x})\to(0,0)$ and $h(x,\mathrm e^{-c/x})\to c$ when $x\to0$.

To sum up, $x\log(x/y)$ could be assigned any value in $[0,+\infty]$ at $(x,y)=(0,0)$. Hence, somewhat more wisely, $x\log(x/y)$ should be assigned no value at $(x,y)=(0,0)$.
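A quick numerical illustration of this path dependence (a sketch; the helper `h` simply mirrors the notation above):

```python
import math

def h(x, y):
    """h(x, y) = x * log(x / y), defined for positive x and y."""
    return x * math.log(x / y)

# Along the diagonal y = x, h is identically 0; along y = e^{-c/x}
# (here c = 2), h(x, y) = x*log(x) + c tends to c as x -> 0+.
for x in [0.1, 0.01, 0.003]:
    print(x, h(x, x), h(x, math.exp(-2.0 / x)))
```

Each path approaches $(0,0)$, yet the values of $h$ along them converge to different limits, matching the argument above.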

Answer (score 10)

If I remember correctly, in the information-theory context, the Kullback-Leibler divergence of two probability distributions $P$ and $Q$ is defined as $$ D(P\|Q)=\sum_{x\in S} P(x)\log \left(\frac{P(x)}{Q(x)}\right), $$ where $S$ is the sample space. In this context, when $P(x)$ and $Q(x)$ are both zero for a given $x$, one conventionally defines $0\log\frac{0}{0}=0$.

For the cases where $Q(x)\neq 0$ but $P(x)=0$, this is somewhat intuitive, as $x\log x\rightarrow 0$ when $x\to 0$. However, in the case $Q(x)= 0$, it becomes less intuitive.

However, taken out of this context, the statement $0\log\frac{0}{0}=0$ is, mathematically speaking, meaningless. In any case, such definitions are just conventions for avoiding unpleasant situations and are often used to circumvent occasional difficulties. It does not mean in any sense that $0\log\frac{0}{0}=0$ in general; it merely adds points to the domain of a function.
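These conventions translate directly into code. Below is a minimal sketch of a KL-divergence computation for finite distributions; `kl_divergence` is a hypothetical helper, not something from the book:

```python
import math

def kl_divergence(p, q):
    """D(P||Q) = sum_x P(x)*log(P(x)/Q(x)) for two finite distributions,
    with the conventions 0*log(0/q) = 0 and 0*log(0/0) = 0; if P(x) > 0
    while Q(x) = 0, the divergence is +infinity."""
    total = 0.0
    for px, qx in zip(p, q):
        if px == 0.0:
            continue              # handles both 0*log(0/q) and 0*log(0/0)
        if qx == 0.0:
            return math.inf       # P puts mass where Q has none
        total += px * math.log(px / qx)
    return total

# The third term below is 0*log(0/0) and is skipped by convention.
print(kl_divergence([0.5, 0.5, 0.0], [0.25, 0.75, 0.0]))
```

The point is that the "value" of $0\log\frac{0}{0}$ never actually gets computed; the convention just tells us which terms to drop from the sum.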

Answer (score 4)

The issue at hand is the log-sum inequality: if $a_k$ and $b_k$ are sequences of nonnegative numbers, then $$\sum_{k=1}^n a_k \log \left( \frac{a_k}{b_k} \right) \ge \left( \sum_{k=1}^n a_k \right) \log \frac{ \sum_{k=1}^n a_k }{ \sum_{k=1}^n b_k}$$

If $a_j$ and $b_j$ are nonzero for at least one index $j$, the right-hand side makes sense, and the inequality still holds if, for the purposes of this proposition only, you take $0 \log \frac 0 0 = 0$ when necessary on the left-hand side.
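A numerical sanity check of the log-sum inequality under this convention (a sketch; the function names are ad hoc):

```python
import math

def log_sum_lhs(a, b):
    """Left side of the log-sum inequality, taking 0*log(0/b) = 0,
    including the 0*log(0/0) case."""
    return sum(ak * math.log(ak / bk) for ak, bk in zip(a, b) if ak > 0.0)

def log_sum_rhs(a, b):
    """Right side: (sum a) * log(sum a / sum b)."""
    sa, sb = sum(a), sum(b)
    return sa * math.log(sa / sb)

# The pair (a_2, b_2) = (0, 0) contributes 0*log(0/0) = 0 on the left.
a = [0.2, 0.0, 0.5]
b = [0.1, 0.0, 0.8]
print(log_sum_lhs(a, b) >= log_sum_rhs(a, b))
```

Dropping the zero pair leaves the inequality intact, which is exactly why the convention is harmless here.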

Answer (score 3)

The other answers are a bit too complicated for my taste. Most people are willing to accept the definition $0\log(0) = 0$, which authors often justify by noting that

$$ x \log(x) \to 0 \text{ as } x \to 0.$$

(Use L'Hopital's rule.) Then we can interpret the ratio as a difference

$$ 0 \log\frac{0}{0} = 0 \log 0 - 0 \log 0 = 0.$$

(Forgive me, oh mathgods, for this blasphemy.)

The real answer is, of course, that we take this as a definition because it makes sense for information theory. When computing the entropy over a countable space, points with zero probability should be omitted from the computation; this jibes with $0\log 0 =0$. When computing the relative entropy of two distributions, you should ignore only the points with zero probability under both, and hence the rule $0\log(0/0)=0$.
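The entropy convention can be sketched as follows (`entropy` is a hypothetical helper, assuming a finite distribution given as a list of probabilities):

```python
import math

def entropy(p):
    """Shannon entropy H(P) = -sum p*log(p), omitting zero-probability
    points in line with the convention 0*log(0) = 0."""
    return -sum(px * math.log(px) for px in p if px > 0.0)

# Adding an impossible outcome does not change the entropy.
print(entropy([0.5, 0.5]), entropy([0.5, 0.5, 0.0]))
```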

Answer (score 0)

I think the author probably had in mind the following statement: $$\lim_{x\rightarrow 0^+}x\log \left( \frac{x}{x}\right)=0,$$ which is obvious. The more general statement $$\lim_{x\rightarrow 0^+,\,y\rightarrow 0^+}x\log \left( \frac{x}{y}\right)=0$$ doesn't hold, even though it seems to be the most relevant statement in that context. To see this, take $y(x)=e^{-\frac{1}{x}}$. As $x\rightarrow 0^+$, we have $y(x)\rightarrow 0^+$, but $$\lim_{x\rightarrow 0^+}x\log \left( \frac{x}{y(x)}\right)=\lim_{x\rightarrow 0^+}x\left[\log x+\frac{1}{x}\right]=\lim_{x\rightarrow 0^+}\left[x\log x+1\right]=1.$$ My take on it is: just take it as a convention. It is a definition, so it doesn't need justification; it's a matter of convenience.
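A numerical check of these two limits (a sketch; the skewed path is evaluated analytically as $x\log x + 1$ to avoid floating-point underflow in $e^{-1/x}$ for small $x$):

```python
import math

# Along y = x the expression x*log(x/x) is identically 0, while along
# y(x) = e^{-1/x} we have x*log(x/y(x)) = x*log(x) + 1, which tends to 1.
for x in [0.1, 0.01, 0.001]:
    diagonal = x * math.log(x / x)
    skewed = x * math.log(x) + 1.0
    print(x, diagonal, skewed)
```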