This is taken from an old exam question that I am struggling to understand.
Let $X,Y$ be any random variables. One has to determine whether $I(X;\max{(Y,0)}) \leq I(X;Y)$ holds always, sometimes, or never.
From my understanding we have
- $I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)$
- $I(X;\max{(Y,0)}) = H(\max(Y,0))- H(\max(Y,0)|X) = H(X) - H(X | \max(Y,0))$
Furthermore using the definition of $H(Y) = - \sum_{y \in \mathcal{Y}}p(y)\log(p(y))$ I conclude that $H(Y) \geq H(\max(Y,0))$ since:
- the set of values $\max(Y,0)$ can take is smaller than the set of values the original $Y$ could take whenever some values of $Y$ are smaller than zero, meaning we have fewer summands.
- we have equality if all values of $Y$ are nonnegative, because then $\max(Y,0)=Y$
- if all values of $Y$ are smaller than $0$, then $\max(Y,0)=0$ is constant, so automatically $H(\max(Y,0)) = 0$ and the inequality also holds.
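To convince myself of this, I checked the entropy comparison numerically on a small made-up example (the pmf below is just an illustration, not part of the exam question): taking the maximum with $0$ merges all negative outcomes of $Y$ into the single outcome $0$, and merging outcomes can only lower the entropy.

```python
import numpy as np

# Hypothetical pmf for Y on the values {-2, -1, 1, 2}.
y_vals = np.array([-2, -1, 1, 2])
p_y = np.array([0.1, 0.3, 0.4, 0.2])

def entropy(p):
    """Shannon entropy in bits, ignoring zero-probability outcomes."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# max(Y, 0) merges all negative values of Y into the single outcome 0.
z_vals = np.unique(np.maximum(y_vals, 0))
p_z = np.array([p_y[np.maximum(y_vals, 0) == z].sum() for z in z_vals])

print(entropy(p_y))  # H(Y)
print(entropy(p_z))  # H(max(Y,0)), no larger than H(Y)
```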
However, from this point onwards I am stuck. The problem is that I have no idea how to show that, for example, $H(X|\max(Y,0)) \geq H(X|Y)$ (if that is even true).
Since this approach didn't get me very far, I thought I would instead try to bound the equivalent expression $I(X;\max{(Y,0)}) = H(X) - H(X | \max(Y,0))$.
We would now have to find a relation between $H(X | \max(Y,0))$ and $H(X | Y)$.
By definition we have $H(X|Y) = \sum_y p(y)\cdot H(X|Y=y)$, where $H(X|Y=y) = -\sum_{x} p(x|y)\log p(x|y)$.
However I struggle to continue from here.
Any help in solving this question is appreciated, even if it's just intuitive help.
Intuitively I think the statement is always true, since I can potentially learn more about $X$ by knowing $Y$ than by knowing only $\max{(Y,0)}$.
The statement is always true, and it follows from the Data Processing Inequality. More precisely, let $Z=\max(Y,0)$. Since $Z$ is a deterministic function of $Y$, the Markov chain $X \to Y \to Z$ holds, and hence $I(X;Z) \leq I(X;Y)$.
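One can also see the inequality numerically: for any joint pmf of $(X,Y)$, the joint pmf of $(X,Z)$ is obtained by summing the columns with $Y < 0$ into the column for $Y = 0$, and the resulting mutual information never exceeds the original one. A minimal sketch with a randomly generated joint pmf (the values and shapes here are just an illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def mutual_information(p_xy):
    """I(X;Y) in bits from a joint pmf given as a 2-D array."""
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return np.sum(p_xy[mask] * np.log2(p_xy[mask] / (p_x @ p_y)[mask]))

# Random joint pmf of X (3 values) and Y on {-2, ..., 2}.
y_vals = np.array([-2, -1, 0, 1, 2])
p_xy = rng.random((3, 5))
p_xy /= p_xy.sum()

# Z = max(Y, 0): collapse the columns with Y < 0 into the column for Y = 0.
z_vals = np.unique(np.maximum(y_vals, 0))
p_xz = np.column_stack(
    [p_xy[:, np.maximum(y_vals, 0) == z].sum(axis=1) for z in z_vals]
)

print(mutual_information(p_xy))  # I(X;Y)
print(mutual_information(p_xz))  # I(X;Z), no larger than I(X;Y) by the DPI
```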