Mutual information of a random subset


Suppose we sample the following two random variables, for some large integer $n$:

  1. Let random variable $X_1$ be a uniformly random subset of $m$ elements chosen from set $[n]:=\{1,\dots,n\}$, where $m < n$.
  2. Let random variable $X_2$ be a uniformly random subset of $k$ elements of $X_1$, where $k<m$.

In other words, $X_2$ is a random subset of $X_1$, and $X_1$ is a random subset of $[n]$.
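For concreteness, the sampling process can be sketched as follows (an illustrative sketch, not part of the original question; the function name `sample` is my own):

```python
import random

def sample(n, m, k, rng=random):
    """Sample the pair (X1, X2): X1 is a uniform m-subset of
    {0, ..., n-1}, and X2 is a uniform k-subset of X1."""
    x1 = frozenset(rng.sample(range(n), m))       # X1: m elements of [n]
    x2 = frozenset(rng.sample(sorted(x1), k))     # X2: k elements of X1
    return x1, x2
```

By symmetry, $X_2$ on its own is a uniformly random $k$-subset of $[n]$, so $H(X_2) = \log\binom{n}{k}$.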

My Question: Does it hold that, for any random variable $Y$ that determines $X_2$ (i.e. $I[ X_2 : Y ] = H[X_2]$), the amount of information that $Y$ contains about $X_2$ is never larger than the amount of information it reveals about $X_1$? Formally, $$ I[ X_2 : Y ] \le I[ X_1 : Y ]? $$

I think this should be true, because whatever we learn from $Y$ about $X_2$ is also "useful" information about $X_1$. But how can I formalize this intuition?

Update: Assume that $Y$ determines $X_2$, i.e. $I[ X_2 : Y ] = H[X_2]$.

1 Answer
You can see your conjecture is false by looking at the special case $Y=X_2$. Then $$I(X_1;Y) = I(X_1;X_2) = H(X_2) - H(X_2\mid X_1) < H(X_2) = I(X_2;Y),$$ and the inequality is strict because $H(X_2\mid X_1) = \log \binom{m}{k} > 0$ whenever $0 < k < m$.

In general:

Since $Y$ determines $X_2$, we have $X_2 = g(Y)$ for some function $g$, so conditioning on $Y$ is at least as informative as conditioning on $X_2$: $$H(X_1 \mid Y) = H(X_1 \mid Y, X_2) \le H(X_1 \mid X_2).$$

Then

$$\begin{align} I(X_1;Y) &= H(X_1) - H(X_1 \mid Y) \\ &\ge H(X_1) - H(X_1 \mid X_2) \\ &= I(X_1;X_2) \\ &= H(X_2) - H(X_2 \mid X_1) \\ &= I(X_2;Y) - H(X_2 \mid X_1) \\ &= I(X_2;Y) - \log \binom{m}{k} \end{align}$$

That's all you can say.
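The special case $Y = X_2$ can also be checked numerically by exact enumeration of the joint distribution (a brute-force sketch of my own, not from the original answer; it verifies that $I(X_1;X_2) = H(X_2) - \log\binom{m}{k}$ for small parameters):

```python
from itertools import combinations
from math import comb, log2

n, m, k = 5, 3, 2  # small example parameters with k < m < n

# Joint distribution: X1 uniform over m-subsets of [n],
# then X2 uniform over k-subsets of X1.
p = {}
for x1 in combinations(range(n), m):
    for x2 in combinations(x1, k):
        p[(x1, x2)] = 1 / (comb(n, m) * comb(m, k))

def H(dist):
    """Shannon entropy in bits of a distribution given as {outcome: prob}."""
    return -sum(q * log2(q) for q in dist.values() if q > 0)

# Marginal distributions of X1 and X2.
px1, px2 = {}, {}
for (x1, x2), q in p.items():
    px1[x1] = px1.get(x1, 0) + q
    px2[x2] = px2.get(x2, 0) + q

# Mutual information I(X1;X2) = H(X1) + H(X2) - H(X1,X2).
I_x1_x2 = H(px1) + H(px2) - H(p)

# With Y = X2, the derivation predicts
# I(X1;Y) = I(X1;X2) = H(X2) - log2(C(m,k)),
# which is strictly less than H(X2) = I(X2;Y).
print(I_x1_x2, H(px2) - log2(comb(m, k)))
```

The two printed values agree, and both are strictly below $H(X_2) = \log_2\binom{n}{k}$, confirming the counterexample to the conjecture.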