Given a Markov chain $X \rightarrow Y \rightarrow Z$, the data processing inequality bounds the mutual information
$$I(X;Z) \leq \min \big( I(X;Y),I(Y;Z) \big)$$
However, it seems intuitive that we should be able to do better through some combination of these quantities. For example, if $X$ is a uniform binary random variable, $Y$ flips $X$ with probability $1/4$, and $Z$ flips $Y$ with probability $1/4$, then $I(X;Z)$ should be strictly lower than either $I(X;Y)$ or $I(Y;Z)$, right? More broadly, if I have a Markov chain of successive corrupting channels
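To sanity-check this concrete example numerically (a small sketch; the helper names `h` and `bsc_mi` are my own): for a uniform input through a binary symmetric channel with flip probability $p$, the mutual information is $1 - h(p)$, and two cascaded BSC($1/4$)s compose into a single BSC with flip probability $2 \cdot \tfrac{1}{4} \cdot \tfrac{3}{4} = \tfrac{3}{8}$.

```python
from math import log2

def h(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_mi(p):
    """I(input; output) of a BSC(p) with a uniform binary input."""
    return 1 - h(p)

p = 0.25
q = 2 * p * (1 - p)          # effective flip probability of BSC(p) composed with BSC(p)
print(bsc_mi(p))             # I(X;Y) = I(Y;Z) ≈ 0.1887 bits
print(bsc_mi(q))             # I(X;Z) ≈ 0.0456 bits, strictly smaller
```

So in this example $I(X;Z)$ is indeed well below the data processing bound $\min(I(X;Y), I(Y;Z))$.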
$$X_1 \rightarrow X_2 \rightarrow \cdots \rightarrow X_k$$
I would imagine that $I(X_1;X_k)$ is much lower than any single $I(X_j;X_{j+1})$. However, I'm not aware of any result that captures this, nor do I understand why one shouldn't exist. Can someone either provide such a result or explain why none exists?
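For what it's worth, the $k$-step intuition is easy to check numerically in the BSC case (again a sketch of my own, assuming every step is a BSC with the same flip probability $p$): the effective flip probability after one more step satisfies $q \mapsto q(1-p) + (1-q)p$, which drives $q$ toward $1/2$ and hence $I(X_1;X_k)$ toward $0$.

```python
from math import log2

def h(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

p = 0.25   # per-step flip probability
q = 0.0    # effective flip probability from X_1 to the current X_k
for step in range(1, 6):
    q = q * (1 - p) + (1 - q) * p      # compose one more BSC(p)
    print(step, 1 - h(q))              # I(X_1; X_{step+1}) in bits, shrinking each step
```

The printed mutual information decreases at every step, consistent with the intuition that each additional channel strictly degrades the end-to-end information (for $0 < p < 1/2$).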