How can I connect this "entropy functional" definition with information theoretic entropy?


This might seem very obvious, but I am thrown for a loop, and a search of StackExchange did not help. In this paper, we have the definition

The entropy $\text{Ent}_\mu(f)$ of a $\mu$-integrable function $f$ is defined to be $\text{Ent}_\mu(f):= E_\mu(f \log f) - E_\mu(f) \log[E_\mu f ] $.

This doesn't seem to match the information theoretic definition of entropy which I am familiar with, where with probability measure $p$, we have

$H_p(X) = E_p[ -\log[p(X)] ]$.

This doesn't line up if we take the trivial case where our measure $\mu$ is just $p$, and $f$ is just the random variable $X$. What am I missing here? Are these the same concept in two different contexts? I'm weak on measure-theoretic notation, and I'm struggling to understand this.

Best answer:

The entropy of a discrete random variable $X$ with law $p$ can be written explicitly as $$H(X) = - \sum_{i=1}^n p(x_i) \log (p(x_i)).$$

On the other hand, if $f$ is a probability density with respect to $\mu$, one has $E_\mu(f) = \int f \, d\mu = 1$, hence $\log (E_\mu(f)) = 0$. One therefore gets $$\operatorname{Ent}_\mu(f) = E_\mu(f \log (f)).$$
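A quick numerical sketch of this reduction, assuming a hypothetical example: take $\mu$ to be the uniform probability measure on four points (mass $1/4$ each) and $f$ a density with respect to it, so that $E_\mu(f) = 1$ and the $E_\mu(f)\log(E_\mu f)$ term vanishes.

```python
import math

# mu: uniform probability measure on 4 points (mass 1/4 each) -- assumed example
mu = [0.25] * 4
# f: a density w.r.t. mu, i.e. sum(f_i * mu_i) = 1 -- hypothetical values
f = [2.0, 1.0, 0.5, 0.5]

E_f = sum(m * fi for m, fi in zip(mu, f))                 # E_mu(f) = 1 for a density
E_flogf = sum(m * fi * math.log(fi) for m, fi in zip(mu, f))

# Ent_mu(f) = E_mu(f log f) - E_mu(f) log(E_mu f)
ent = E_flogf - E_f * math.log(E_f)

print(E_f)              # 1.0, so the second term is zero
print(ent == E_flogf)   # True: Ent_mu(f) reduces to E_mu(f log f)
```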

Edit (cf. page 20 of the paper linked above): taking $\mu$ to be the counting measure and $f = p$, one gets $H(p) = -\operatorname{Ent}_\mu (p)$, hence the two definitions agree up to sign.
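The identity above can be checked numerically. In this sketch (with an assumed example pmf), $\mu$ is the counting measure on the support, so $E_\mu(g)$ is just the plain sum and the density of $p$ with respect to $\mu$ is $p$ itself.

```python
import math

p = [0.5, 0.25, 0.125, 0.125]   # hypothetical pmf

# Shannon entropy: H(p) = -sum p_i log p_i
H = -sum(pi * math.log(pi) for pi in p)

# Under the counting measure, E_mu(g) = sum over the support
E_p = sum(p)                                        # = 1, since p is a pmf
E_plogp = sum(pi * math.log(pi) for pi in p)
ent = E_plogp - E_p * math.log(E_p)                 # Ent_mu(p)

print(abs(H + ent) < 1e-12)   # True: H(p) = -Ent_mu(p)
```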