Show that $I(X_1 X_2 ; Y) = I(X_1;Y) + I(X_2; Y | X_1)$.
I know that the mutual information between two random variables $Z$ and $W$ is defined as:
$$I(Z;W)=H(Z)-H(Z|W)=H(W)-H(W|Z).$$
Therefore, for $I(X_1 X_2; Y)$, we can re-express it as:
$$I(X_1 X_2; Y) = H(X_1X_2)-H(X_1X_2|Y)$$
By using the chain rule with $H(X_1 X_2)$, we can re-express this term as:
$$H(X_1 X_2)=H(X_1) + H(X_2|X_1)$$
because $P(X_1,\ldots,X_n)=\prod_{i=1}^{n} P(X_i \mid X_1,\ldots,X_{i-1})$, which for this case, $n=2$, gives $P(X_1, X_2) = P(X_1)\,P(X_2|X_1)$; taking the expectation of $-\log$ turns the product into the sum above.
However, I am not sure how to separate $H(X_1 X_2| Y)$. I thought that, using the fact that $P(Z,W)=P(Z|W)P(W)$, we would have:
$$H(X_1 X_2)=H(X_1 X_2 | Y)+H(Y)$$
So, $H(X_1X_2|Y)=H(X_1 X_2)-H(Y)=H(X_1) + H(X_2|X_1) -H(Y)$. Hence,
$$I(X_1 X_2;Y)=H(X_1)+H(X_2|X_1)-H(X_1)-H(X_2|X_1)+H(Y)=H(Y),$$
which does not make sense.
I am pretty sure about the expansion of the first term, but not the second one, which is why I think I obtain the wrong result.
Could anyone give me a bit of guidance on this?
Your solution is almost complete, but it went south when you used this identity: $$ H(X_1,X_2)=H(X_1,X_2 | Y)+H(Y),$$ which is incorrect. The correct version is $H(X_1,X_2,Y)=H(X_1,X_2 |Y)+H(Y)$, but we do not need it at all. I use the generic definition
$$I(X_1,X_2;Y) = H(X_1,X_2)-H(X_1,X_2|Y)$$
Now we have $H(X_1,X_2) = H(X_1)+H(X_2|X_1)$ and $H(X_1,X_2|Y) = H(X_1|Y) + H(X_2|X_1,Y)$. Substitute these and we get
$$I(X_1,X_2;Y) = H(X_1)+H(X_2|X_1) - \left( H(X_1|Y) + H(X_2|X_1,Y) \right) \\ = \left( H(X_1) - H(X_1|Y) \right) + \left(H(X_2|X_1) - H(X_2|X_1,Y) \right) \\ = I(X_1;Y) + I(X_2;Y|X_1) $$
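Since every step above is a finite sum over a joint pmf, the chain rule can also be sanity-checked numerically. Below is a minimal sketch, assuming NumPy; the random pmf, the helper `H`, and all variable names are illustrative, not part of the derivation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random joint pmf p(x1, x2, y) on small alphabets (axes: x1, x2, y).
p = rng.random((3, 4, 5))
p /= p.sum()

def H(pmf):
    """Shannon entropy (in bits) of a pmf given as an array."""
    q = pmf[pmf > 0]
    return -np.sum(q * np.log2(q))

# Marginals needed for the entropies below.
p_x1x2 = p.sum(axis=2)          # p(x1, x2)
p_y    = p.sum(axis=(0, 1))     # p(y)
p_x1   = p.sum(axis=(1, 2))     # p(x1)
p_x1y  = p.sum(axis=1)          # p(x1, y)

# I(X1,X2; Y) = H(X1,X2) + H(Y) - H(X1,X2,Y)
I_x1x2_y = H(p_x1x2) + H(p_y) - H(p)

# I(X1; Y) = H(X1) + H(Y) - H(X1,Y)
I_x1_y = H(p_x1) + H(p_y) - H(p_x1y)

# I(X2; Y | X1) = H(X2|X1) - H(X2|X1,Y)
#              = H(X1,X2) + H(X1,Y) - H(X1) - H(X1,X2,Y)
I_x2_y_given_x1 = H(p_x1x2) + H(p_x1y) - H(p_x1) - H(p)

# Independent check: I(X1,X2;Y) computed directly as the KL divergence
# D( p(x1,x2,y) || p(x1,x2) p(y) ).
prod = p_x1x2[:, :, None] * p_y[None, None, :]
I_direct = np.sum(p * np.log2(p / prod))

assert np.isclose(I_x1x2_y, I_x1_y + I_x2_y_given_x1)
assert np.isclose(I_direct, I_x1x2_y)
```

The KL-divergence computation follows a different numerical path than the entropy decomposition, so the final assertion is a genuine consistency check rather than an algebraic tautology.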
All the identities can be proved using Bayes' rule (i.e., the chain rule of conditional probability) and the definition of entropy. For example, consider the proof of $H(X_1,X_2|Y) = H(X_1|Y)+H(X_2|X_1,Y)$, as follows. Assume $X_1,X_2,Y$ take values from the alphabets $S_1,S_2,S_Y$ respectively (I use $\log$ in the definitions, so the unit of measurement depends on the base of $\log$, but the proof is the same):
$$H(X_1,X_2|Y) = -\sum_{S_Y}\sum_{S_1}\sum_{S_2}P(Y)P(X_1,X_2|Y)\log P(X_1,X_2|Y) \\ =_{(a)} -\sum_{S_Y}\sum_{S_1}\sum_{S_2}P(Y)P(X_1|Y)P(X_2|X_1,Y)\log \left[ P(X_1|Y)P(X_2|X_1,Y) \right] \\ = -\sum_{S_Y}\sum_{S_1}\sum_{S_2}P(Y)P(X_1|Y)P(X_2|X_1,Y) \left[ \log P(X_1|Y) + \log P(X_2|X_1,Y) \right] \\ = -\sum_{S_Y}\sum_{S_1}\sum_{S_2}P(Y)P(X_1|Y)P(X_2|X_1,Y)\log P(X_1|Y) \\ - \sum_{S_Y}\sum_{S_1}\sum_{S_2}P(Y)P(X_1|Y)P(X_2|X_1,Y)\log P(X_2|X_1,Y) $$
where step $(a)$ is the chain rule of conditional probability, $P(X_1,X_2|Y)=P(X_1|Y)P(X_2|X_1,Y)$. Now notice that in the first sum the $\log$ term does not depend on $X_2$, so we can sum out $X_2$ using $\sum_{S_2}P(X_2|X_1,Y)=1$; in the second sum, $P(Y)P(X_1|Y)=P(X_1,Y)$ is exactly the weighting in the definition of $H(X_2|X_1,Y)$. We arrive at
$$-\sum_{S_Y}\sum_{S_1}P(Y)P(X_1|Y)\log P(X_1|Y) - \sum_{S_Y}\sum_{S_1}\sum_{S_2}P(X_1,Y)P(X_2|X_1,Y)\log P(X_2|X_1,Y) \\ = H(X_1|Y) + H(X_2|X_1,Y)$$
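This conditional-entropy chain rule can also be checked numerically from the definitions alone. A minimal sketch, assuming NumPy; the random pmf and variable names are mine, chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Random joint pmf p(x1, x2, y) on small alphabets (axes: x1, x2, y).
p = rng.random((3, 4, 5))
p /= p.sum()

p_y   = p.sum(axis=(0, 1))   # p(y)
p_x1y = p.sum(axis=1)        # p(x1, y)

# Conditional entropies straight from the definitions (in bits):
# H(X1,X2|Y) = -sum p(x1,x2,y) log p(x1,x2|y),  with p(x1,x2|y) = p / p(y)
H_x1x2_given_y = -np.sum(p * np.log2(p / p_y[None, None, :]))
# H(X1|Y)    = -sum p(x1,y)    log p(x1|y),     with p(x1|y) = p(x1,y) / p(y)
H_x1_given_y = -np.sum(p_x1y * np.log2(p_x1y / p_y[None, :]))
# H(X2|X1,Y) = -sum p(x1,x2,y) log p(x2|x1,y),  with p(x2|x1,y) = p / p(x1,y)
H_x2_given_x1y = -np.sum(p * np.log2(p / p_x1y[:, None, :]))

assert np.isclose(H_x1x2_given_y, H_x1_given_y + H_x2_given_x1y)
```

Each conditional entropy is computed from its own log-ratio, so the assertion exercises the same factorization $P(X_1,X_2|Y)=P(X_1|Y)P(X_2|X_1,Y)$ that the proof uses in step $(a)$.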