Differential entropy from samples: $-\int_{-\infty}^{\infty} p(x) \log(p(x)) dx$


Question

I am trying to calculate the differential entropy of a probability distribution with an unknown probability density function $p(x)$ but for which I have multiple samples. The random variable, $X$, is multivariate: an image composed of dependent univariate random variables (pixels) $X_{x,y}$.

The differential entropy is given by: $$-\int_{-\infty}^{\infty} p(x) \log(p(x)) dx$$

So far, I have tried computing the anti-derivative by hand, with Wolfram Alpha, and via Lambert W functions, all without success. If the anti-derivative were known, it might be possible to use Monte-Carlo integration (or other approximate methods) to approximate the integral from the samples.
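For reference, one standard way to set this up without an anti-derivative is to rewrite the integral as an expectation, which a sample average can then approximate (assuming some density estimate $\hat{p}$ can be evaluated at the samples):

$$h(X) = -\int_{-\infty}^{\infty} p(x) \log(p(x)) dx = -\mathbb{E}_{X \sim p}\left[\log p(X)\right] \approx -\frac{1}{N} \sum_{i=1}^{N} \log \hat{p}(x_i),$$

where $x_1, \dots, x_N$ are the available samples. This is the usual Monte-Carlo (resubstitution) form; its accuracy depends entirely on the quality of $\hat{p}$.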

I am posting in a mathematics forum in the hope that someone has the tools to solve this integral, or at least to explain why it cannot be solved in closed form. Intuitively, since we know that $\int_{-\infty}^{\infty} p(x) dx = 1$, it feels like there should be a way.

Related approaches:

There are two main approaches in the literature:

  1. Bin/discretize the continuous variable, compute a histogram, and use the discrete entropy as a proxy;
  2. Use a kernel method: approximate the density by placing a normal distribution at each sample, with mean equal to the sample value and a chosen fixed variance (the bandwidth).

While these approaches are sensible for univariate variables, and even for (mostly) independent multivariate variables, they do not work well for images, where the dependencies between pixels are strong.
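For concreteness, here is a minimal sketch of both approaches in the univariate case, using NumPy and SciPy. The function names (`entropy_histogram`, `entropy_kde`) and the bin count are my own choices, not anything from the question; SciPy's `gaussian_kde` uses an automatic bandwidth rather than a fixed variance, so it is only a stand-in for the kernel method described above.

```python
import numpy as np
from scipy.stats import gaussian_kde

def entropy_histogram(samples, bins=32):
    """Histogram (plug-in) estimate of differential entropy, in nats.

    The discrete entropy of the binned samples plus log(bin width)
    approximates the differential entropy for small bins.
    """
    counts, edges = np.histogram(samples, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]                       # drop empty bins (0 * log 0 := 0)
    width = edges[1] - edges[0]        # uniform bin width
    return -np.sum(p * np.log(p)) + np.log(width)

def entropy_kde(samples):
    """Resubstitution estimate: h(X) ~ -(1/N) * sum log p_hat(x_i)."""
    kde = gaussian_kde(samples)        # Gaussian kernel, automatic bandwidth
    return -np.mean(np.log(kde(samples)))

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=10_000)
# True differential entropy of N(0, 1) is 0.5 * log(2 * pi * e) ~ 1.4189 nats.
h_hist = entropy_histogram(x)
h_kde = entropy_kde(x)
print(h_hist, h_kde)
```

Both estimators should land near 1.42 nats on this synthetic Gaussian, but they share the weakness noted above: extending them to a full image means estimating a density over thousands of dependent dimensions, where histograms are empty and kernel estimates are badly biased.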