Information Diagram: definition of multivariate mutual information


As defined on Wikipedia, the Venn diagram

[Venn diagram of the information measures for two variables]

shows the additive and subtractive relationships among the various information measures associated with variables $X$ and $Y$. From this picture it is easy to see that $I(X;Y)=H(X)+H(Y)-H(X,Y)$. When the number of variables increases, things get harder. For example, given:

[Venn diagram of the information measures for three variables $X$, $Y$, $Z$]

I think that $I(X;Y;Z)$ could be: $$I(X;Y;Z)=H(X)+H(Y)+H(Z)-[H(X,Y)+H(X,Z)+H(Y,Z)]+H(X,Y,Z)$$ Is it true?

What happens for four, five (and so on) variables? Is there a formula that answers these questions?


Best Answer

Yup, good intuition. This is called multivariate mutual information, or interaction information. In general $$I(X_1;X_2;\dots;X_n) = \sum_{\emptyset \neq T \subseteq [1:n]} (-1)^{|T|+1} H(X_T),$$ where $H(X_T)$ is the joint entropy of the variables $\{X_i\}_{i \in T}$.
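This alternating sum is easy to check numerically. Here is a minimal sketch (the helper names `entropy` and `mmi`, and the example pmf, are my own, not from any library) that evaluates the inclusion-exclusion formula directly from a joint pmf stored as a dict:

```python
from itertools import combinations
from math import log2

def entropy(joint, idx):
    """Shannon entropy (bits) of the marginal over the variable
    positions in idx; joint is a pmf given as {outcome_tuple: prob}."""
    marg = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in idx)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marg.values() if p > 0)

def mmi(joint, n):
    """I(X_1;...;X_n) = sum over nonempty T of (-1)^(|T|+1) H(X_T)."""
    return sum((-1) ** (len(T) + 1) * entropy(joint, T)
               for size in range(1, n + 1)
               for T in combinations(range(n), size))

# Sanity check: three perfectly correlated fair bits share exactly 1 bit.
joint = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}
print(mmi(joint, 3))  # 1.0
```

For three copies of one fair bit, every entropy in the sum is 1 bit, so the alternating sum gives $3 - 3 + 1 = 1$ bit, as expected.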

Often the definition is given inductively. Notice that $$ I(X_1;X_2;X_3) = I(X_1;X_2) - I(X_1;X_2|X_3).$$ One can then posit a natural generalisation (within the formal structure of information theory) to conditional $3$-MMI $I(X_1;X_2;X_3|X_4)$, and then $4$-MMI $I(X_1;X_2;X_3;X_4) = I(X_1;X_2;X_3) - I(X_1;X_2;X_3|X_4)$, and then daisy chain up to $n$-MMI. Surprisingly this yields the same expression as the above.
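The agreement between the two definitions can be checked numerically for any particular joint distribution. A small self-contained sketch (the helper `H` and the example pmf are mine, chosen arbitrarily) expands both sides for three binary variables:

```python
from math import log2

def H(joint, idx):
    """Entropy (bits) of the marginal on positions idx of a joint pmf."""
    marg = {}
    for o, p in joint.items():
        k = tuple(o[i] for i in idx)
        marg[k] = marg.get(k, 0.0) + p
    return -sum(p * log2(p) for p in marg.values() if p > 0)

# an arbitrary joint pmf over three binary variables (probs sum to 1)
joint = {(0, 0, 0): 0.25, (0, 1, 1): 0.10, (1, 0, 1): 0.30,
         (1, 1, 0): 0.05, (0, 0, 1): 0.20, (1, 1, 1): 0.10}

# inductive form: I(X;Y) - I(X;Y|Z)
mi_xy    = H(joint, (0,)) + H(joint, (1,)) - H(joint, (0, 1))
mi_xy_gz = (H(joint, (0, 2)) + H(joint, (1, 2))
            - H(joint, (2,)) - H(joint, (0, 1, 2)))
inductive = mi_xy - mi_xy_gz

# entropic inclusion-exclusion form
incl_excl = (H(joint, (0,)) + H(joint, (1,)) + H(joint, (2,))
             - H(joint, (0, 1)) - H(joint, (0, 2)) - H(joint, (1, 2))
             + H(joint, (0, 1, 2)))

print(abs(inductive - incl_excl) < 1e-12)  # True
```

The two expressions are in fact identical term by term once $I(X;Y)$ and $I(X;Y|Z)$ are expanded into entropies, which is why they agree for every distribution, not just this one.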

The good thing about this definition is that it gives a natural interpretation: the 3-MMI captures how much more information $X_1$ has about $X_2$ than it does once one also knows $X_3$ (this is the reason the quantity shows up in studies of the flow of information, e.g. in some of Pearl's work on Bayesian networks). On the other hand, this definition hides the symmetry of the quantity (surprising, when interpreted this way) that is evident in the entropic decomposition. Another thing the second way of writing it makes clear is that the MMI can be negative.
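A standard illustration of negativity (my own example, not from the answer above): let $X$ and $Y$ be independent fair bits and $Z = X \oplus Y$. Every pairwise mutual information is zero, yet knowing $Z$ makes $X$ and $Y$ fully dependent, so the inclusion-exclusion sum comes out to $-1$ bit:

```python
from itertools import combinations
from math import log2

def entropy(joint, idx):
    """Entropy (bits) of the marginal on positions idx of a joint pmf."""
    marg = {}
    for o, p in joint.items():
        k = tuple(o[i] for i in idx)
        marg[k] = marg.get(k, 0.0) + p
    return -sum(p * log2(p) for p in marg.values() if p > 0)

# X, Y independent fair bits, Z = X XOR Y: each single variable and each
# pair is uniform, so H(X_i) = 1 and H(X_i, X_j) = H(X, Y, Z) = 2.
joint = {(x, y, x ^ y): 0.25 for x in (0, 1) for y in (0, 1)}
mmi = sum((-1) ** (len(T) + 1) * entropy(joint, T)
          for size in (1, 2, 3)
          for T in combinations(range(3), size))
print(mmi)  # -1.0
```

Plugging the entropies into the alternating sum gives $3 \cdot 1 - 3 \cdot 2 + 2 = -1$, matching the computation.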