Mutual information expressed as Kullback-Leibler divergence


My lecturer defines the conditional mutual information as $$ I(X;Y\mid Z) = D_{KL}\big(p(X,Y\mid Z)\parallel p(X\mid Z)\;p(Y\mid Z)\big).$$ Is this correct? It doesn't quite make sense to me as written; I think it should instead be the expected value of this divergence over $Z$.

Also, is $I(X;Y|Z)$ the same as $I(X,Y|Z)$?


BEST ANSWER

Mutual information is usually written $I(X;Y)=D_{KL}\big(p(X,Y)\parallel p(X)\,p(Y)\big)$. Your instructor has given a slightly generalised version that depends on another variable, $Z$.

You are free to take the expectation of this over $Z$. Note that if $Z$ is constant, this reduces to the usual definition.
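As a quick numerical sketch of the unconditional identity $I(X;Y)=D_{KL}\big(p(X,Y)\parallel p(X)p(Y)\big)$, here is a small made-up joint distribution for two binary variables (the numbers and variable names are illustrative, not from the question):

```python
import numpy as np

# Hypothetical joint distribution p(x, y) for two binary variables.
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])

# Marginals p(x) and p(y).
p_x = p_xy.sum(axis=1)
p_y = p_xy.sum(axis=0)

# I(X;Y) = D_KL( p(X,Y) || p(X) p(Y) ), computed in bits.
mi = np.sum(p_xy * np.log2(p_xy / np.outer(p_x, p_y)))
print(mi)  # ≈ 0.278 bits
```

Here the marginals are uniform, so independence would give the product $p(x)p(y)=0.25$ everywhere; the divergence from that product measures the dependence between $X$ and $Y$.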


No, not quite. The conditional mutual information is: $$\begin{align} I(X;Y\mid Z) & =D_{KL} (p(X,Y,Z)\parallel p(X\mid Z)\;p(Y\mid Z)\;p(Z)) \\[2ex] & = \sum_{z\in Z} p_{_Z}(z)\; D_{KL}(p(X,Y\mid Z=z)\parallel p(X\mid Z=z)\;p(Y\mid Z=z)) \\[3ex] & = \sum_{x\in X}\sum_{y\in Y}\sum_{z\in Z} p_{_Z}(z)p_{_{X,Y\mid Z}}(x,y\mid z)\log_2\frac{p_{_{X,Y\mid Z}}(x,y\mid z)}{p_{_{X\mid Z}}(x\mid z)\;p_{_{Y\mid Z}}(y\mid z)} \end{align}$$

So yes, conditional mutual information is the expectation over $Z$ of the divergence between the conditional joint and the product of the conditionals, exactly as you suspected; the first line packages that expectation as a single KL divergence over the full joint.
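The equality of the first two lines can be checked numerically. The following sketch uses a made-up joint $p(x,y,z)$ (all names and numbers are my own, chosen so that $X$ and $Y$ are dependent given $Z=0$ and independent given $Z=1$) and computes both forms:

```python
import numpy as np

# Hypothetical conditional distributions p(x, y | z) and prior p(z).
p_z = np.array([0.5, 0.5])
p_xy_given_z = np.array([[[0.4, 0.1], [0.1, 0.4]],        # p(x, y | z=0): dependent
                         [[0.25, 0.25], [0.25, 0.25]]])   # p(x, y | z=1): independent
# Full joint p(x, y, z), indexed as [x, y, z].
p_xyz = np.einsum('zxy,z->xyz', p_xy_given_z, p_z)

# Marginals p(x, z) and p(y, z).
p_xz = p_xyz.sum(axis=1)
p_yz = p_xyz.sum(axis=0)

# Form 1: D_KL( p(X,Y,Z) || p(X|Z) p(Y|Z) p(Z) ).
# Since p(x|z) p(y|z) p(z) = p(x,z) p(y,z) / p(z), the log ratio
# equals p(x,y,z) p(z) / ( p(x,z) p(y,z) ).
ratio = p_xyz * p_z[None, None, :] / (p_xz[:, None, :] * p_yz[None, :, :])
cmi_kl = np.sum(p_xyz * np.log2(ratio))

# Form 2: sum_z p(z) * D_KL( p(X,Y | z) || p(X | z) p(Y | z) ).
cmi_exp = 0.0
for z, pz in enumerate(p_z):
    pxy = p_xyz[:, :, z] / pz          # p(x, y | z)
    px = pxy.sum(axis=1)               # p(x | z)
    py = pxy.sum(axis=0)               # p(y | z)
    cmi_exp += pz * np.sum(pxy * np.log2(pxy / np.outer(px, py)))

print(cmi_kl, cmi_exp)  # both ≈ 0.139 bits
```

The two forms agree: only the $z=0$ slice contributes (the $z=1$ slice is an exact product, so its log ratio is zero), and the total is the $z=0$ divergence weighted by $p(z=0)$.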