I'm a biologist trying to apply Mutual Information (MI) to some RNA secondary structures. I know that there exist two MI equations that are mathematically equivalent:
$I(X,Y) = \sum_{x,y} p(x,y) \log_2 \frac{p(x,y)}{p(x) p(y)}$ (1)
$I(X,Y) = H(X)+H(Y)-H(X,Y)$ (2)
And I know that
$H(X) = -\sum_{x} p(x) \log_2 p(x)$ (3)
$H(X,Y) = -\sum_{x} \sum_{y} p(x,y) \log_2 p(x,y)$ (4)
What I don't know is the process by which (1) can be turned into (2) and vice versa. Can someone please walk me through it?
Thanks in advance! =)
One key step is to note that $$\sum_y p(x,y)=p(x)$$ and $$\sum_x p(x,y)=p(y)$$ (law of total probability).
Another key step is $$\log_2 \frac{p(x,y)}{p(x)p(y)} = \log_2 p(x,y) - \log_2 p(x) - \log_2 p(y).$$
The equality then follows immediately: \begin{align*} &\phantom{{}={}}H(X)+H(Y)-H(X,Y)\\ &= -\sum_x p(x) \log_2 p(x) - \sum_y p(y) \log_2 p(y) + \sum_x \sum_y p(x,y) \log_2 p(x,y)\\ &= -\sum_x \sum_y p(x,y) \log_2 p(x) - \sum_x \sum_y p(x,y) \log_2 p(y) + \sum_x \sum_y p(x,y) \log_2 p(x,y)\\ &= \sum_x \sum_y p(x,y) \log_2 \frac{p(x,y)}{p(x)p(y)} \end{align*}
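As a numerical sanity check of the derivation above, here is a short sketch (using a made-up joint distribution; any non-negative matrix summing to 1 with no zero entries works) that computes $I(X,Y)$ both ways and confirms they agree:

```python
import numpy as np

# Hypothetical joint distribution p(x, y): rows index values of X,
# columns index values of Y. Entries are non-negative and sum to 1.
p_xy = np.array([[0.10, 0.20, 0.05],
                 [0.25, 0.15, 0.25]])

# Marginals via the law of total probability (the first key step):
p_x = p_xy.sum(axis=1)  # p(x) = sum_y p(x, y)
p_y = p_xy.sum(axis=0)  # p(y) = sum_x p(x, y)

# Formula (1): I(X,Y) = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) )
mi_direct = np.sum(p_xy * np.log2(p_xy / np.outer(p_x, p_y)))

# Formula (2): I(X,Y) = H(X) + H(Y) - H(X,Y),
# with H(p) = -sum p log2 p applied to marginals and the joint alike.
def entropy(p):
    return -np.sum(p * np.log2(p))

mi_entropy = entropy(p_x) + entropy(p_y) - entropy(p_xy)

print(mi_direct, mi_entropy)
assert np.isclose(mi_direct, mi_entropy)
```

Note that zero-probability cells would need special handling (the convention $0 \log 0 = 0$), which this sketch avoids by choosing strictly positive entries.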