How to calculate entropy from a set of samples?


The entropy (expected information content) of a discrete random variable is defined as:

$$ H(X) = \sum_{i} {\mathrm{P}(x_i)\,\mathrm{I}(x_i)} = -\sum_{i} {\mathrm{P}(x_i) \log_b \mathrm{P}(x_i)} $$

This allows one to calculate the entropy of a random variable given its probability distribution.

But what if I have a set of scalar samples and I want to calculate their entropy? In this case the probability density function is not available, but perhaps there is a formula that gives an approximation (analogous to the sample mean)? Does it have a name?


The most natural (and almost trivial) way to estimate (not calculate) the probabilities is just counting: $$\hat{p_i}=\frac{n_i}{N}$$ where $p_i$ is the probability of symbol $i$, $\hat{p_i}$ is its estimator, $n_i$ is the number of occurrences of symbol $i$, and $N$ is the total number of samples. Then you plug this estimator into the entropy formula.
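A minimal sketch of this plug-in (counting) estimator in Python; the function name `plugin_entropy` is my own choice, not standard:

```python
from collections import Counter
import math

def plugin_entropy(samples, base=2):
    """Plug-in entropy estimate: p_i is approximated by n_i / N,
    then substituted into H = -sum p_i * log_b(p_i)."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log(c / n, base) for c in counts.values())

# Two equally frequent symbols give 1 bit, as expected for a fair coin.
print(plugin_entropy(["H", "T", "H", "T", "H", "T"]))  # → 1.0
```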

However, this might not be a fair estimator of the entropy rate of your source, because it does not take into account the dependencies between successive symbols. It only makes sense if the source emits independent symbols - or if you are only interested in the marginal entropies (and provided that your source is stationary - ergodic, actually).