Why are the entropies of magnitude and squared magnitude of Gaussian vector different?


I consider a circular complex normal variable $x = x_r + i x_i \sim \mathcal{CN}(0, \sigma^2)$. I know that the PDFs of the magnitude $r$ of that variable and the squared magnitude $s = r^2$ are given by a Rayleigh distribution $f(r;\sigma) = \frac{r}{\sigma^2} e^{-r^2/(2\sigma^2)}$ and an exponential distribution $f(s;\sigma) = \frac{1}{2\sigma^2}e^{-s/(2\sigma^2)}$, respectively (cf. for example http://www.randomservices.org/random/special/Rayleigh.html).
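As a quick sanity check of these two PDFs, here is a minimal Monte Carlo sketch, assuming the per-component convention implied by the formulas above (i.e. $x_r, x_i \sim \mathcal{N}(0, \sigma^2)$ independently):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, n = 1.5, 1_000_000

# Per-component convention matching the PDFs above: x_r, x_i ~ N(0, sigma^2).
x = rng.normal(0, sigma, n) + 1j * rng.normal(0, sigma, n)
r = np.abs(x)   # should be Rayleigh(sigma)
s = r**2        # should be exponential with mean 2*sigma^2

# Compare empirical means against the known closed forms.
print(r.mean(), sigma * np.sqrt(np.pi / 2))  # Rayleigh mean: sigma*sqrt(pi/2)
print(s.mean(), 2 * sigma**2)                # exponential mean: 2*sigma^2
```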

Now, if I look at the entropies of these PDFs (e.g. on Wikipedia), I find that they scale with $\log(\sigma)$ and $\log(\sigma^2)$, respectively. Here, I do not care about terms that do not depend on $\sigma$.
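For reference, the closed-form differential entropies (in nats, as listed on Wikipedia) are

$$ h(r) = 1 + \ln\frac{\sigma}{\sqrt{2}} + \frac{\gamma}{2}, \qquad h(s) = 1 + \ln(2\sigma^2), $$

where $\gamma$ is the Euler–Mascheroni constant, so the $\sigma$-dependent parts are indeed $\ln(\sigma)$ and $\ln(\sigma^2) = 2\ln(\sigma)$.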

How can this difference be explained intuitively? My intuition is that both representations should carry the same Shannon information, as they are just different representations of the same quantity, which is completely characterized by $\sigma$. Note that, in general, a difference of two entropies will not absorb the scaling: $\log(\sigma_1)-\log(\sigma_2)\neq \log(\sigma_1^2)-\log(\sigma_2^2)$.


2 Answers

Answer 1

I think the origin of the distributions doesn't matter here, so I'll simply address the difference in entropy between the distribution of a quantity $r$ and that of its square $s = r^2$.

I personally don't see why the entropies should be the same. It's true that $r$ and $s$ may represent the same physical variable, but they do so in two different ways. You can look at this from several angles: mathematically, entropy is just a functional of the PDF, and the PDF may be more, or less, "spread out" for $r$ than for $s$, since they are two completely different PDFs.

Another way of seeing this is that you may lose information by taking the square, for example information about the sign! Consider the discrete distribution with

$$ p(x=1) = 0.5; \qquad p(x=-1) = 0.5 $$

Its entropy is $S[p] = 1$ bit. But the distribution of $x^2$ is just

$$ P(x^2=1) = 1$$

which has entropy $S[P] = 0$.

I hope this example makes it clear that there are simple counterexamples.
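For concreteness, the two discrete entropies can be checked in a couple of lines with scipy (Shannon entropy in base 2, i.e. in bits):

```python
from scipy.stats import entropy

# Shannon entropy (base 2) of the two discrete distributions above.
print(entropy([0.5, 0.5], base=2))  # S[p] = 1.0 bit
print(entropy([1.0], base=2))       # S[P] = 0.0 bits
```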

EDIT: From the mathematical point of view, the transformation one applies to get the distribution $f_s(s)$ from the distribution $f_r(r)$ is designed to keep the probability measure the same, i.e. to guarantee

$$ f_r(r)dr = f_s(s)ds$$

However, this doesn't guarantee at all that

$$ f_r(r)\log f_r(r)dr = f_s(s)\log f_s(s)ds $$

which is what would make the entropies equal.
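In fact, the mismatch can be made explicit. For a monotone transformation $s = g(r)$, the standard change-of-variables identity for differential entropy reads

$$ h(s) = h(r) + \mathbb{E}\left[\log |g'(r)|\right], $$

and with $g(r) = r^2$ on $r > 0$ this gives $h(s) = h(r) + \mathbb{E}[\log 2r]$. For the Rayleigh variable the correction term itself grows like $\log \sigma$, which is exactly the gap between the $\log(\sigma)$ and $\log(\sigma^2)$ scalings in the question.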

Answer 2

The main reason this doesn't work is that differential entropy (i.e. entropy for continuous random variables) is not invariant to scale. That is, $h(X) \neq h(2X)$; in fact $h(aX) = h(X) + \log|a|$. (You can check this by calculating the differential entropies of $\mathrm{Unif}(0, 1/2)$ and $\mathrm{Unif}(0, 1)$.)
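A minimal check of the uniform example with scipy (whose `.entropy()` method returns the differential entropy, in nats, for continuous distributions):

```python
from scipy.stats import uniform

# Unif(loc, loc + scale) has differential entropy ln(scale).
print(uniform(loc=0, scale=0.5).entropy())  # ln(0.5) ≈ -0.693
print(uniform(loc=0, scale=1.0).entropy())  # ln(1.0) =  0.0
```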

Similarly, it is not invariant to arbitrary transformations of the random variable. Thus the entropies that you calculate are different, and they should be.

So, in general, differential entropy should not be interpreted as the absolute information content of a system.

You can read more at: https://en.wikipedia.org/wiki/Differential_entropy

If you need a measure that is invariant under a change of variables, you should use the limiting density of discrete points ( https://en.wikipedia.org/wiki/Limiting_density_of_discrete_points ).
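The invariant quantity it is built on is the relative entropy with respect to a reference density $m(x)$,

$$ H = -\int p(x) \log\frac{p(x)}{m(x)}\, dx, $$

which is invariant under a change of variables because $p$ and $m$ pick up the same Jacobian factor, so the ratio is unchanged.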