higher moments of entropy... does the variance of $ \log x $ have any operational meaning?


The Shannon entropy is the average of the negative log of a list of probabilities $ \{ x_1 , \dots , x_d\} $: $$ H(x)= -\sum\limits_{i=1}^d x_i \log x_i . $$ There are of course lots of nice interpretations of the Shannon entropy. What about the variance of $ -\log x_i $, $$ \sigma^2 (-\log x)=\sum\limits_i x_i (\log x_i )^2-\left( \sum\limits_i x_i \log x_i \right)^2 ? $$ Does this have any meaning, and has it been used in the literature?
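Both quantities in the question are straightforward to compute. A minimal sketch (the function name `surprise_moments` is my own; this quantity is sometimes called the varentropy):

```python
import math

def surprise_moments(p):
    """Return (mean, variance) of the surprise -log2(p_i) under the
    distribution p, i.e. the Shannon entropy H and sigma^2(-log p)."""
    h = -sum(pi * math.log2(pi) for pi in p if pi > 0)
    second = sum(pi * math.log2(pi) ** 2 for pi in p if pi > 0)
    return h, second - h * h

# Example: for {1/2, 1/4, 1/4} the surprises are 1, 2, 2 bits,
# so H = 1.5 bits and the variance is 2.5 - 1.5^2 = 0.25.
h, v = surprise_moments([0.5, 0.25, 0.25])
```

Note the variance vanishes exactly for the uniform distribution, where every outcome is equally surprising.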

Accepted answer:

$\log 1/x_i$ is sometimes known as the 'surprise' (e.g. in units of bits) of drawing the $i$-th symbol. Since $\log 1/X$ is a random variable, it carries all the operational meanings that come with any random variable: the entropy is the average surprise, and the higher moments are simply the higher moments of the surprise of $X$.

There is indeed a literature on using the variance of information measures (not of the surprise in this case, but of the divergence), under the name 'dispersion'. Two good places to get started: http://people.lids.mit.edu/yp/homepage/data/gauss_isit.pdf http://arxiv.org/pdf/1109.6310v2.pdf

The application is clear. When you only know the expected value of a random variable, you know it only to first order. When you need tighter bounds, for instance on how far a sample average can deviate from its mean, you need higher moments.
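A minimal Monte Carlo sketch of this point (all parameters are illustrative): for i.i.d. draws from $\{1/2, 1/4, 1/4\}$, the per-symbol average surprise of a block of length $n$ concentrates around $H$, and the variance of the surprise predicts the spread via the central limit theorem, $\sigma \approx \sqrt{V/n}$. This is the second-order refinement that the dispersion literature makes precise.

```python
import math
import random

random.seed(0)

# Distribution {1/2, 1/4, 1/4}: entropy H = 1.5 bits and
# surprise variance V = 0.25 bits^2 (computed by hand above).
H, V = 1.5, 0.25

n = 1000       # block length (illustrative choice)
trials = 500   # number of independent blocks (illustrative choice)

def draw_surprise():
    """Sample one symbol and return its surprise -log2 p(symbol)."""
    u = random.random()
    # The two p = 1/4 symbols have the same surprise, so they are merged.
    p = 0.5 if u < 0.5 else 0.25
    return -math.log2(p)

# Empirical spread of the per-symbol average surprise across blocks
devs = []
for _ in range(trials):
    avg = sum(draw_surprise() for _ in range(n)) / n
    devs.append(avg - H)

emp_std = math.sqrt(sum(d * d for d in devs) / trials)
pred_std = math.sqrt(V / n)  # CLT prediction from the surprise variance
```

The first-order statement (the AEP) only says the average surprise tends to $H$; the variance is what tells you how fast, which is exactly what finite-blocklength bounds exploit.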