What is the variance of self-information (or surprisal)?


The self-information of an outcome $x_i$, or surprisal, is defined as: $$ I(x_i)=-\log P(x_i), $$

where $P$ means probability. This way, the Shannon entropy can be seen as the "average" or "expected" surprisal: $$ H=-\sum_i P(x_i)\,\log P(x_i). $$

This is quite intuitive, and it helps one understand what Shannon entropy really measures.

But what about the variance of the surprisal? Does the quantity $$ \sum_i P(x_i)\,\big(\log P(x_i) \big)^2-H^2 $$ have any statistical meaning? Is it equivalent to anything known, or can it at least be written in a clearer form?
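For concreteness, here is a minimal numerical sketch of the quantity in question, using an arbitrary example distribution (assumed purely for illustration) and natural logarithms:

```python
import math

# Example distribution (chosen only for illustration)
p = [0.5, 0.25, 0.125, 0.125]

# Surprisal of each outcome: I(x_i) = -log P(x_i)
surprisal = [-math.log(pi) for pi in p]

# Shannon entropy: the expected surprisal, H = E[I]
H = sum(pi * s for pi, s in zip(p, surprisal))

# Variance of the surprisal: Var(I) = E[I^2] - H^2
var_I = sum(pi * s**2 for pi, s in zip(p, surprisal)) - H**2

print(H, var_I)
```

For this dyadic distribution, $H = 1.75$ bits, i.e. $1.75\ln 2 \approx 1.213$ nats, and the surprisal variance comes out positive, as it must for any non-degenerate distribution.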

Thanks.