Mutual information of two random variables with event set 1 >> event set 2


I have the following problem: Let's say we have three random variables $X, Y$ and $Z$. I know the mutual information $I(X,Y)$ (see below).

First let's calculate how much the mutual information can increase if, instead of $Y$ alone, we observe the joint variable $YZ$:

$$ I(X, YZ) - I(X,Y) =: I(X,Y|Z)\\ \Leftrightarrow I(X,YZ)=I(X,Y)+I(X,Y|Z)\\ =I(X,Y)+\sum_z p(Z=z)I(X,Y|Z=z) $$

But as $I(X,Y)$ is bounded by $\min(\log|X|,\log|Y|)$ (the logs of the event-set sizes), so is $I(X,Y|Z=z)$ (just look at the definition of conditional mutual information if you don't believe me).
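As a quick numerical sanity check of this bound, here is a sketch in Python; the joint distribution below is randomly generated purely for illustration:

```python
import itertools
import math
import random

def mutual_information(joint):
    """I(X;Y) in nats for a joint pmf given as a dict {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

random.seed(0)
nx, ny = 8, 3  # event-set sizes |X| = 8, |Y| = 3
weights = [random.random() for _ in range(nx * ny)]
joint = {xy: w / sum(weights)
         for xy, w in zip(itertools.product(range(nx), range(ny)), weights)}

mi = mutual_information(joint)
bound = min(math.log(nx), math.log(ny))  # = log|Y| here, since |Y| < |X|
assert 0 <= mi <= bound
```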

Moreover, as my question arises from a problem in applied mathematics, my first requirement states $\frac{|Y|}{|X|}\rightarrow 0$ in the sense of the sizes of the event sets. A second requirement states $I(X, Y) \rightarrow 0$. So what I get in the limit is:

$$ I(X,YZ) \leq \log|Y| $$

However, I am not satisfied. My intuition tells me that from $\frac{|Y|}{|X|}\rightarrow 0$ it follows that $I(X,Y|Z)$ is not only less than or equal to $\log|Y|$ but vanishes in the limit, as there shouldn't be much correlation between a variable with a small event set $|Y|$ and one with a much, much bigger event set $|X|$:

$$ I(X,YZ) \rightarrow 0~\text{if}~\frac{|Y|}{|X|}\rightarrow 0 \wedge I(X, Y) \rightarrow 0 $$

But I don't know how to show this in a mathematically correct way, and so I am not even 100% sure it is right.

Maybe someone knows the answer to my problem? Feel free to ask if anything is unclear.

Edit: The event sets of the random variables are finite and discrete.


You have made a mistake in the first equation. By the chain rule of mutual information, $$ I(X;YZ) = I(X;Y) + I(X;Z|Y). $$ One way to see that your equation is wrong is that its right-hand side, $I(X;Y) + I(X;Y|Z)$, is symmetric under swapping $X$ and $Y$, whereas $I(X;YZ)$ is not.
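Here is a tiny numerical check of the chain rule, and of the failure of the decomposition in the question; the toy joint distribution ($Y$ a copy of a fair bit $X$, $Z$ constant) is made up for illustration:

```python
import math

def H(pmf):
    """Shannon entropy in nats of a pmf given as {outcome: probability}."""
    return -sum(p * math.log(p) for p in pmf.values() if p > 0)

def marginal(joint, keep):
    """Marginal pmf over the coordinates in `keep` of a pmf on triples (x, y, z)."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

# Toy joint: X is a fair bit, Y is a copy of X, Z is constant.
joint = {(0, 0, 0): 0.5, (1, 1, 0): 0.5}

def I(a, b):
    """I(A;B) = H(A) + H(B) - H(A,B), computed from marginals."""
    return H(marginal(joint, a)) + H(marginal(joint, b)) - H(marginal(joint, a + b))

I_X_YZ = I((0,), (1, 2))  # I(X; YZ)
I_X_Y = I((0,), (1,))     # I(X; Y)
# Conditional MI via entropies: I(X;Z|Y) = H(X,Y) + H(Y,Z) - H(Y) - H(X,Y,Z).
I_X_Z_given_Y = (H(marginal(joint, (0, 1))) + H(marginal(joint, (1, 2)))
                 - H(marginal(joint, (1,))) - H(joint))
I_X_Y_given_Z = (H(marginal(joint, (0, 2))) + H(marginal(joint, (1, 2)))
                 - H(marginal(joint, (2,))) - H(joint))

# The chain rule holds ...
assert abs(I_X_YZ - (I_X_Y + I_X_Z_given_Y)) < 1e-12
# ... while the decomposition in the question does not (off by log 2 here).
assert abs(I_X_YZ - (I_X_Y + I_X_Y_given_Z)) > 0.5
```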

Further, let $X_n = Z_n$ be uniformly distributed on $\{1, \ldots, n\}$ and let $Y_n$ be independent of $X_n$ and constant. Then $\frac{|Y_n|}{|X_n|}\to 0$, $I(X_n;Y_n) = 0$, and $I(X_n;Y_n Z_n) = I(X_n;Z_n) = \log n$.
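This counterexample can be checked numerically as well; the sketch below picks $n = 1000$ arbitrarily:

```python
import math

def H(pmf):
    """Shannon entropy in nats of a pmf given as {outcome: probability}."""
    return -sum(p * math.log(p) for p in pmf.values() if p > 0)

def marginal(joint, keep):
    """Marginal pmf over the coordinates in `keep` of a pmf on triples (x, y, z)."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

n = 1000
# X = Z uniform on {1, ..., n}, Y constant: joint pmf over (x, y, z).
joint = {(k, 0, k): 1.0 / n for k in range(1, n + 1)}

def I(a, b):
    """I(A;B) = H(A) + H(B) - H(A,B), computed from marginals."""
    return H(marginal(joint, a)) + H(marginal(joint, b)) - H(marginal(joint, a + b))

assert abs(I((0,), (1,))) < 1e-9                   # I(X;Y) = 0
assert abs(I((0,), (1, 2)) - math.log(n)) < 1e-9   # I(X;YZ) = log n
```

So the ratio $\frac{|Y_n|}{|X_n|}$ going to $0$ does nothing to cap $I(X;YZ)$: the $Z$ component alone can carry $\log n$ nats about $X$.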