I read a paper in information theory that claims that the following sum of three mutual information expressions
$$ I(Y_1;X_1,X_2,X_3)+I(Y_2;X_2,X_3\mid Y_1,X_1)+I(Y_3;X_3\mid Y_1,Y_2,X_1,X_2)\tag1$$
coincides with the following single mutual information expression
$$I(Y_1;X_1,X_2,X_3)$$
if the following Markov chains are satisfied:
$$X_2 X_3\rightarrow Y_1 \rightarrow Y_2 \text{ given } X_1 \tag 2$$
$$X_3\rightarrow Y_2 \rightarrow Y_3 \text{ given } X_1 X_2\tag 3$$
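To fix notation (my paraphrase of how I read these chains): a Markov chain $A \rightarrow B \rightarrow C$ given $D$ means the conditional law of $C$ factorizes through $B$, so (2) and (3) say
$$ p(y_2 \mid x_1, x_2, x_3, y_1) = p(y_2 \mid x_1, y_1), \qquad p(y_3 \mid x_1, x_2, x_3, y_2) = p(y_3 \mid x_1, x_2, y_2), $$
i.e. once $(X_1, Y_1)$ are known, $(X_2, X_3)$ carry no further information about $Y_2$, and similarly for $Y_3$.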
Does anyone know why this is true?
Some thoughts and my reasoning:
Is it true that (2) and (3) imply the following inequalities? $$I(Y_1; X_2,X_3 \mid X_1) \geq I(Y_2; X_2,X_3 \mid X_1)$$ $$I(Y_2; X_3 \mid X_1,X_2) \geq I(Y_3; X_3 \mid X_1,X_2)$$
If so, I can upper-bound (1) as
$$(1) \leq \,\,\, I(Y_1;X_1,X_2,X_3)+\underbrace {I(Y_1;X_2,X_3\mid Y_1,X_1)}_{0 ?}+\underbrace{I(Y_2;X_3\mid Y_1,Y_2,X_1,X_2)}_{0 ?}$$
Is my reasoning correct?
Thanks
The relationship (2) implies $I(Y_2; X_2, X_3| Y_1, X_1) = 0$. Similarly, the relationship (3) implies $I(Y_3; X_3|Y_1, Y_2, X_1, X_2)=0$. This proves the desired claim.
So, I think your remaining question is "why does relationship (2) imply $I(Y_2; X_2, X_3| Y_1, X_1) = 0$"? (If you can understand that, you can also understand why (3) implies the last term is zero). Here is an explanation: Your relationship (2) is: $$ \mbox{Given $X_1$ we have:} \: \: (X_2, X_3) \rightarrow Y_1 \rightarrow Y_2 $$
This means that when $X_1$ is known, $Y_1$ depends probabilistically on $(X_2, X_3)$, and $Y_2$ depends probabilistically only on $Y_1$, so that $Y_2$ is conditionally independent of $(X_2, X_3)$ given $(X_1, Y_1)$. It follows that: $$ H(Y_2 \mid Y_1, X_1, X_2, X_3) = H(Y_2\mid Y_1, X_1) $$ Thus: $$ I(Y_2; X_2, X_3\mid Y_1,X_1) = H(Y_2\mid Y_1,X_1) - H(Y_2\mid Y_1,X_1,X_2,X_3) = 0 $$
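As a numerical sanity check (my own addition, with my own variable names): one can build a joint pmf that satisfies chain (2) *by construction*, with otherwise arbitrary random conditional tables, lump $(X_2, X_3)$ into a single symbol `x23`, and verify that $H(Y_2\mid Y_1,X_1) - H(Y_2\mid Y_1,X_1,X_2,X_3)$ vanishes, exactly as in the derivation above.

```python
import random
from itertools import product
from math import log2
from collections import defaultdict

random.seed(1)

def rand_dist(n):
    """A random probability vector of length n."""
    w = [random.random() for _ in range(n)]
    s = sum(w)
    return [v / s for v in w]

# Joint pmf satisfying chain (2) by construction:
#   p(x1, x23, y1, y2) = p(x1) p(x23) p(y1 | x1, x23) p(y2 | y1, x1),
# where x23 lumps the pair (X2, X3) into one symbol.  The tables are
# random, so nothing below depends on a special choice of distribution.
K = 3  # alphabet size of each variable (arbitrary)
p_x1, p_x23 = rand_dist(K), rand_dist(K)
p_y1 = {(x1, x23): rand_dist(K) for x1 in range(K) for x23 in range(K)}
p_y2 = {(y1, x1): rand_dist(K) for y1 in range(K) for x1 in range(K)}

pmf = {}
for x1, x23, y1, y2 in product(range(K), repeat=4):
    pmf[(x1, x23, y1, y2)] = (p_x1[x1] * p_x23[x23]
                              * p_y1[(x1, x23)][y1] * p_y2[(y1, x1)][y2])

def cond_entropy(pmf, target, given):
    """H(target | given) in bits; target/given are coordinate index tuples."""
    joint, marg = defaultdict(float), defaultdict(float)
    for outcome, p in pmf.items():
        joint[tuple(outcome[i] for i in target + given)] += p
        marg[tuple(outcome[i] for i in given)] += p
    return -sum(p * log2(p / marg[k[len(target):]])
                for k, p in joint.items() if p > 0)

# Coordinates: 0 = X1, 1 = X23, 2 = Y1, 3 = Y2.
# I(Y2; X2,X3 | Y1, X1) = H(Y2 | Y1, X1) - H(Y2 | Y1, X1, X23)
gap = cond_entropy(pmf, (3,), (2, 0)) - cond_entropy(pmf, (3,), (2, 0, 1))
print(gap)  # ≈ 0 (zero up to floating-point rounding)
```

If `p_y2` is replaced by a table that also depends on `x23`, the chain is broken and the gap generically becomes strictly positive.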
Here is an example of random variables $X_1, X_2, X_3, Y_1, Y_2$ that satisfy (2): Define $A, B, C$ as mutually independent random variables uniformly distributed over $[0,1]$. Define:
\begin{align} X_1 &= 0\\ X_2 &= A \\ X_3 &= B \\ Y_1 &= X_2 + X_3 \\ Y_2 &= Y_1 + C \end{align}
I made this simple: $X_1$ is identically zero and provides no information. Then we always have: $$ (X_2, X_3) \rightarrow Y_1 \rightarrow Y_2 $$
In particular, once we know $Y_1$, telling us the values of $(X_2, X_3)$ provides no additional information regarding the value of $Y_2$. The value $Y_2$ is conditionally independent of $(X_2, X_3)$ given $Y_1$. Thus: $$ H(Y_2 | Y_1, X_2, X_3) = H(Y_2 | Y_1) = H(C) $$
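The example can also be checked by computer. Differential entropies of continuous uniforms are awkward numerically, so here is a discrete analogue (my variation: $A, B, C$ uniform on $\{0,1\}$ instead of $[0,1]$), computing the relevant conditional mutual information exactly from the joint pmf:

```python
from itertools import product
from math import log2
from collections import defaultdict

# Discrete analogue of the example above: X1 = 0, X2, X3, C i.i.d.
# uniform on {0, 1}, Y1 = X2 + X3, Y2 = Y1 + C.
pmf = defaultdict(float)
for x2, x3, c in product([0, 1], repeat=3):
    x1, y1 = 0, x2 + x3
    pmf[(x1, x2, x3, y1, y1 + c)] += 1 / 8

def cmi(pmf, a, b, c):
    """I(A; B | C) in bits; a, b, c are coordinate index tuples."""
    def marg(idx):
        m = defaultdict(float)
        for outcome, p in pmf.items():
            m[tuple(outcome[i] for i in idx)] += p
        return m
    p_abc, p_ac, p_bc, p_c = marg(a + b + c), marg(a + c), marg(b + c), marg(c)
    total = 0.0
    for outcome, p in pmf.items():
        ka = tuple(outcome[i] for i in a)
        kb = tuple(outcome[i] for i in b)
        kc = tuple(outcome[i] for i in c)
        total += p * log2(p_abc[ka + kb + kc] * p_c[kc]
                          / (p_ac[ka + kc] * p_bc[kb + kc]))
    return total

# Coordinates: 0 = X1, 1 = X2, 2 = X3, 3 = Y1, 4 = Y2.
print(cmi(pmf, (4,), (1, 2), (3, 0)))  # I(Y2; X2,X3 | Y1, X1) → 0.0
print(cmi(pmf, (3,), (1, 2), (0,)))    # I(Y1; X2,X3 | X1)     → 1.5
```

The second line confirms that $Y_1$ genuinely depends on $(X_2, X_3)$ (here $I = H(Y_1) = 1.5$ bits), so the vanishing of the first quantity reflects the Markov structure rather than degeneracy.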