A common expression for calculating the entropy of a series of bits appears to be:
$$-\sum_{i} P\left(x_i\right) \log_b\left(P\left(x_i\right)\right)$$
This seems to fail (or my intuition of entropy is simply incorrect) in cases where the bits are highly correlated or patterned: for example, 0000000011111111 maximizes the entropy calculated by this expression, and so does 0101010101010101.
In both of these cases, the data is highly compressible, but the entropy is high according to the given expression. Is there some definition of entropy that takes this into account? Or am I looking for something else entirely?
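To make the point concrete, here is a small sketch of what that expression gives when the probabilities are estimated from bit frequencies (the helper name and the frequency-estimation choice are mine, assuming base-2 logarithms):

```python
from math import log2
from collections import Counter

def bitwise_entropy(bits: str) -> float:
    """Evaluate -sum_i P(x_i) * log2(P(x_i)), with P(x_i) estimated
    from the frequency of each symbol in the string."""
    n = len(bits)
    return -sum((c / n) * log2(c / n) for c in Counter(bits).values())

# Both highly patterned strings have equal counts of 0s and 1s,
# so the expression returns the maximum of 1 bit per symbol:
print(bitwise_entropy("0000000011111111"))  # 1.0
print(bitwise_entropy("0101010101010101"))  # 1.0
```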
There seem to be some misunderstandings in your question. First, when you speak of "the case where the bits are highly correlated", are you thinking of a source that always (or with high probability) produces such patterns, or are you thinking of particular realizations? In the second case, you must understand that the traditional definition of entropy does not apply to realizations (single events) but to a source that follows a given probability distribution (hence it does not make sense to speak of the "entropy of a message", only of the "entropy of the source").
If, instead, you are thinking of a source that produces correlated bits, then the definition still applies, but you must consider the probabilities of the full messages. That is, you should compute the entropy of $X^n=(x_1,x_2,\cdots,x_n)$ as $$H(X^n)=-\sum p(X^n) \log p(X^n)$$ where the sum is over all possible messages. Then, if you want, you can compute the "entropy rate" (entropy per symbol) as $H_r=H(X^n)/n$.
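As an illustration, here is a minimal sketch (the two-message source below is a made-up example) computing $H(X^n)$ and the entropy rate for a source that only ever emits the two patterned 16-bit strings from the question, each with probability 1/2:

```python
from math import log2

def entropy(dist):
    """H = -sum p * log2(p) over the message probabilities."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

# Hypothetical source: emits one of two highly patterned
# 16-bit messages, each with probability 0.5.
source = {"0000000011111111": 0.5, "0101010101010101": 0.5}

H = entropy(source)  # 1 bit for the whole 16-bit message
H_r = H / 16         # entropy per symbol: 0.0625 bits
print(H, H_r)
```

Even though each string looks "balanced" bit by bit, the source as a whole carries only 1 bit of uncertainty, so the entropy rate is far below 1 bit/symbol, matching the intuition that the data is highly compressible.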
Only when the bits are independent (and identically distributed) would we get $H_r = H(x_1)$. This is all explained in any textbook, as well as in Shannon's original paper.
A common model to simplify the computation of the entropy rate is to assume a Markov chain. For a stationary first-order Markov chain (the present depends only on the immediate past, via a fixed transition matrix of probabilities), the entropy rate (and hence the joint entropy) can be computed from the transition matrix. Assuming $H_r = H(x_1)$ would be equivalent to assuming a zero-order Markov model, which can sometimes be a good approximation, but sometimes (when the symbols are highly correlated) is not.
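A minimal sketch of this for a binary first-order chain (the 0.9 "stickiness" value is an arbitrary choice for illustration): the entropy rate is the stationary-weighted average of the per-row entropies of the transition matrix, $H_r = -\sum_i \pi_i \sum_j P_{ij} \log_2 P_{ij}$.

```python
from math import log2

def entropy_rate(P, pi):
    """Entropy rate of a stationary Markov chain:
    H_r = -sum_i pi_i * sum_j P[i][j] * log2(P[i][j])."""
    return -sum(pi[i] * sum(p * log2(p) for p in row if p > 0)
                for i, row in enumerate(P))

# "Sticky" binary chain: 90% chance of repeating the previous bit,
# so long runs like 00000000 and 11111111 are typical output.
P = [[0.9, 0.1],
     [0.1, 0.9]]
pi = [0.5, 0.5]  # stationary distribution (chain is symmetric)

print(entropy_rate(P, pi))  # ~0.469 bits/symbol, well below 1
```

The zero-order model would report 1 bit/symbol for this source (each bit is 0 or 1 with probability 1/2 marginally), while the first-order model correctly reflects the correlation.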