Information Encoded by a Probability Density Function


I want to calculate the information needed to encode a probability density function. For a discrete probability distribution, such as a coin flip, the information (Shannon entropy) is calculated as follows:

$$S=\sum_n -P_n\log_2P_n$$

So for a coin flip we would have

$$S=-0.5\log_2 0.5 - 0.5\log_2 0.5 = 1$$

So it would take one bit to encode a coin flip (heads or tails, one or zero).
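As a sanity check, the discrete formula is easy to evaluate directly. A minimal Python sketch (the function name is my own):

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete distribution,
    skipping zero-probability outcomes (0 * log 0 := 0)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(shannon_entropy([0.5, 0.5]))          # fair coin -> 1.0 bit
print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))  # fair 4-sided die -> 2.0 bits
```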

If you want to calculate this for a continuous probability density, you obviously cannot use a discrete sum; you have to use an integral. But when I do this,

$$S=\int_{-\infty}^\infty{-P(x)\log_2P(x)}\,dx$$

With the normal distribution

$$P(x) = e^{-\frac{x^2}{r^2}}\frac{1}{r\sqrt{\pi}}$$

I get an equation somewhere along the lines of

$$S=\log_2{r}+C$$
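For reference, the integral can be carried out in closed form. Substituting the Gaussian above, using $-\log_2 P(x) = \frac{x^2}{r^2\ln 2} + \log_2(r\sqrt{\pi})$ and $\int_{-\infty}^{\infty} P(x)\,x^2\,dx = r^2/2$:

$$S = \frac{1}{2\ln 2} + \log_2(r\sqrt{\pi}) = \log_2 r + \tfrac{1}{2}\log_2(\pi e)$$

so the constant is $C = \tfrac{1}{2}\log_2(\pi e) \approx 1.55$.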

which seems right at first, but it means that for some values of $r$ you would need negative information to describe the function. I think the problem is rooted in the fact that for a continuous distribution, probabilities are only nonzero over ranges of $x$, so a point value such as $P(3)$ has no real significance on its own.
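The negative values are easy to confirm numerically. The sketch below (function and parameter names are my own) integrates $-P(x)\log_2 P(x)$ on a grid for the Gaussian given above:

```python
import numpy as np

def differential_entropy_bits(r, n=200001, span=12.0):
    """Numerically integrate -P(x) log2 P(x) for the Gaussian
    P(x) = exp(-x^2 / r^2) / (r * sqrt(pi)) over [-span*r, span*r]."""
    x = np.linspace(-span * r, span * r, n)
    p = np.exp(-(x / r) ** 2) / (r * np.sqrt(np.pi))
    # Riemann sum; the tails are negligible at span = 12 standard widths.
    return np.sum(-p * np.log2(p)) * (x[1] - x[0])

print(differential_entropy_bits(1.0))  # ~1.547, i.e. 0.5*log2(pi*e)
print(differential_entropy_bits(0.1))  # negative: ~-1.775
```

The result matches $\log_2 r + \tfrac{1}{2}\log_2(\pi e)$ and indeed goes negative once $r$ is small enough.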

Any help would be very much appreciated.

There is 1 answer below.


The amount of information (in the Shannon sense, measured in bits) that a continuous source produces is infinite. You need an infinite number of bits to encode a variable that, for example, is uniform on $[0,1]$. The differential entropy is not the same as the true entropy; it cannot be interpreted in that way.
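One way to see this: quantize a uniform $[0,1]$ variable to $n$ bits. Each of the $2^n$ equal-width bins has probability $2^{-n}$, so the discrete entropy is exactly $n$ bits, which grows without bound as the quantization gets finer. A small sketch (assuming nothing beyond the standard library):

```python
import math

def quantized_uniform_entropy(n_bits):
    """Entropy in bits of a uniform [0,1] variable quantized into
    2**n_bits equal-width bins, each with probability 2**-n_bits."""
    p = 2.0 ** -n_bits
    return -(2 ** n_bits) * p * math.log2(p)

for n in (1, 4, 8, 16):
    print(n, quantized_uniform_entropy(n))  # entropy equals n bits
```

The same construction applied to a Gaussian gives entropy $\approx n + h$ for an $n$-bit quantization, where $h$ is the differential entropy; $h$ is the (possibly negative) offset, not a bit count in itself.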