Building intuition behind information diagram


I am trying to build intuition for the various entropies depicted in the diagram below:

*(image: information diagram)*

I am trying to compare it with the probability distribution Venn diagram:

*(image: probability distribution Venn diagram)*

I am also keeping the following definitions in mind while trying to understand the I-diagram:

  1. The entropy of a random variable is the average level of "information" inherent in the variable's possible outcomes:

$$H(X)=E[I(X)]=E[-\log(P(X))]=-\sum_{i=1}^nP(x_i)\log{P(x_i)}$$

  1. The conditional entropy quantifies the amount of information needed to describe the outcome of a random variable $Y$ given that the value of another random variable $X$ is known.

$$H(Y|X)=-\sum_{x\in X,y\in Y}p(x,y)\log \frac{p(x,y)}{p(x)}$$

  1. Joint entropy is a measure of the uncertainty associated with a set of variables.

$$H(X,Y)=-\sum_{x\in X,y\in Y}P(x,y)\log_2[P(x,y)]$$

  1. Mutual information quantifies the "amount of information" obtained about one random variable through observing the other random variable.

$$I(X;Y)=\sum_{x\in X,y\in Y}p(x,y)\log_2\left(\frac{p(x,y)}{p(x)p(y)} \right)$$
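To make the four definitions concrete, here is a minimal Python sketch that computes each quantity for a small joint distribution. The joint pmf `p_xy` is a made-up example, not taken from the question:

```python
import math

# Hypothetical joint distribution p(x, y) over X, Y in {0, 1}
# (illustrative numbers only).
p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

# Marginals p(x) and p(y), obtained by summing out the other variable
p_x = {x: sum(p for (xi, _), p in p_xy.items() if xi == x) for x in (0, 1)}
p_y = {y: sum(p for (_, yi), p in p_xy.items() if yi == y) for y in (0, 1)}

def H(dist):
    """Shannon entropy in bits: -sum p * log2(p)."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

H_X = H(p_x)    # entropy of X
H_Y = H(p_y)    # entropy of Y
H_XY = H(p_xy)  # joint entropy H(X,Y): same formula, applied to the joint pmf

# Conditional entropy H(Y|X) = -sum p(x,y) * log2( p(x,y) / p(x) )
H_Y_given_X = -sum(p * math.log2(p / p_x[x]) for (x, y), p in p_xy.items())

# Mutual information I(X;Y) = sum p(x,y) * log2( p(x,y) / (p(x) p(y)) )
I_XY = sum(p * math.log2(p / (p_x[x] * p_y[y])) for (x, y), p in p_xy.items())
```

Note that the joint entropy is just the ordinary entropy formula applied to the joint pmf, which is one way to see the parallel with the joint distribution asked about below.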

Doubts

  1. I can intuitively understand the labels of the probability distribution Venn diagram, but not those of the information diagram. The Venn diagram labels have a simple meaning: for example, $P(A)$ is just the fraction of occurrences of $A$ in the population. But $H(X)$ is itself an expectation, so the same clear-cut intuition does not carry over, and it feels like the labels must simply be memorized. Is there a simpler intuition behind the information diagram labels?

  2. Where do the conditional entropy labels $H(X|Y)$ and $H(Y|X)$ come from? Especially given that we cannot label any part of a Venn diagram with a conditional probability.

  3. Also, how can that region be labeled $H(X|Y)$, given that it lies outside $H(Y)$?

  4. Can we draw any mathematical intuition for the information diagram label $I(X;Y)$?

  5. Wikipedia does not give a good "logical" definition of joint entropy. Can we have one that leads to the label $H(X,Y)$? Also, is it parallel to the joint distribution, especially given that we cannot label any part of a Venn diagram with a joint distribution?

1 Answer

The information diagram is just a way to represent the fundamental equations $$ H(X,Y)=H(X)+H(Y|X), $$

and

$$ I(X;Y)=H(X)-H(X|Y), $$

and their versions with $X$ and $Y$ exchanged. Nothing more than that.
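These two identities can be checked numerically. The sketch below uses a small made-up joint pmf (the numbers are assumptions for illustration) and verifies that both decompositions hold:

```python
import math

# A hypothetical joint pmf p(x, y) (illustrative numbers only)
p = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.4, (1, 1): 0.1}

# Marginals, computed by summing out the other variable
px = {x: sum(v for (xi, _), v in p.items() if xi == x) for x in (0, 1)}
py = {y: sum(v for (_, yi), v in p.items() if yi == y) for y in (0, 1)}

def H(d):
    """Shannon entropy in bits of a pmf given as a dict of probabilities."""
    return -sum(v * math.log2(v) for v in d.values() if v > 0)

H_XY = H(p)  # joint entropy
# Conditional entropies, straight from the definition
H_Y_given_X = -sum(v * math.log2(v / px[x]) for (x, y), v in p.items())
H_X_given_Y = -sum(v * math.log2(v / py[y]) for (x, y), v in p.items())
# Mutual information, straight from the definition
I = sum(v * math.log2(v / (px[x] * py[y])) for (x, y), v in p.items())

# The two equations the diagram encodes:
assert abs(H_XY - (H(px) + H_Y_given_X)) < 1e-9  # H(X,Y) = H(X) + H(Y|X)
assert abs(I - (H(px) - H_X_given_Y)) < 1e-9     # I(X;Y) = H(X) - H(X|Y)
```

Reading the diagram through these equations: $H(X|Y)$ is the part of the $H(X)$ circle outside $H(Y)$ precisely because removing it from $H(X)$ leaves the overlap $I(X;Y)$.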