Let $X$ be a fair coin flip and $Y$ a coin flip with $p = \frac{1}{3}$. I want to find the possible values of the mutual information $I(X,Y)$ between them.
If they are independent then this value is $0$. But when I try to find an upper bound the only way I see, by writing down the definitions, I run into an unmanageable mess of $\log$s.
I have the rough bound $\min(H(X), H(Y))$ from $I(X,Y)=H(X)-H(X|Y)$ and $I(X,Y)=H(Y)-H(Y|X)$, but it is clearly not attained.
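For a concrete number on that rough bound, the two marginal entropies (here in nats) can be computed directly; this is just a quick sketch, with `H` a small helper for discrete Shannon entropy:

```python
import math

def H(*p):
    """Shannon entropy (in nats) of a discrete distribution given as probabilities."""
    return -sum(q * math.log(q) for q in p if q > 0)

HX = H(1/2, 1/2)   # fair coin: log 2 ≈ 0.6931
HY = H(1/3, 2/3)   # biased coin with p = 1/3: ≈ 0.6365
bound = min(HX, HY)
print(HX, HY, bound)
```

So the rough bound is $H(Y) \approx 0.6365$ nats, since $H(Y) < H(X)$ here.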
The first step is to figure out all possible joint distributions of $(X,Y)$. In this case there happens to be a very simple one-parameter family of possibilities. For concreteness let's say that $P(X=0) = P(X=1) = 1/2$ and $P(Y=0) = 2/3$, $P(Y=1) = 1/3$. In the joint distribution, let's say that $P(X=0,Y=0) = t$. This forces us to have $P(X=1,Y=0) = 2/3-t$ and $P(X=0,Y=1) = 1/2 - t$ and $P(X=1,Y=1) = t-1/6$. Also, each of these four numbers needs to be in $[0,1]$, which is only possible if $t \in [1/6, 1/2]$. Thus we have a complete characterization of all possible joint distributions.
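This parameterization is easy to sanity-check numerically; the sketch below (using exact rational arithmetic, with `joint` a hypothetical helper name) verifies that for several values of $t \in [1/6, 1/2]$ the four cells are valid probabilities and reproduce the stated marginals:

```python
from fractions import Fraction as F

def joint(t):
    """Joint table P(X=x, Y=y) for the one-parameter family with P(X=0,Y=0) = t."""
    return {(0, 0): t,
            (1, 0): F(2, 3) - t,
            (0, 1): F(1, 2) - t,
            (1, 1): t - F(1, 6)}

for t in (F(1, 6), F(1, 4), F(1, 3), F(1, 2)):
    P = joint(t)
    assert all(0 <= p <= 1 for p in P.values())       # valid probabilities
    assert P[(0, 0)] + P[(0, 1)] == F(1, 2)           # P(X=0) = 1/2
    assert P[(0, 0)] + P[(1, 0)] == F(2, 3)           # P(Y=0) = 2/3
    assert sum(P.values()) == 1
print("marginals check out")
```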
Now $I(X,Y)$ is a straightforward calculation: \begin{align*} I(X,Y) &= H(X) + H(Y) - H(X,Y) \\ &= \log 2 - (2/3)\log(2/3) - (1/3)\log(1/3) + t\log t + (1/2-t)\log(1/2-t) \\ &\qquad + (2/3-t)\log(2/3-t) + (t-1/6)\log(t-1/6). \end{align*}
To see all the possible values this can take, just graph this expression over the interval $t \in [1/6, 1/2]$.
The zero is at $t=1/3$, which corresponds to $X$ and $Y$ being independent. You can also find the maximum by plugging in $t=1/6$ or $t=1/2$ (with the usual convention that $0\log 0 = 0$).
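Evaluating the formula numerically (in nats, a quick sketch using the $0\log 0 = 0$ convention) confirms the zero at $t=1/3$; notably, the two endpoints give the same value, since the multiset of joint probabilities $\{1/2, 1/3, 1/6, 0\}$ is the same at $t=1/6$ and $t=1/2$:

```python
import math

def I(t):
    """Mutual information I(X,Y) in nats for the joint table parameterized by t."""
    cells = [t, 2/3 - t, 1/2 - t, t - 1/6]              # the four joint probabilities
    HX = math.log(2)                                     # entropy of the fair coin
    HY = -(2/3) * math.log(2/3) - (1/3) * math.log(1/3)  # entropy of the p = 1/3 coin
    HXY = -sum(p * math.log(p) for p in cells if p > 0)  # convention: 0 log 0 = 0
    return HX + HY - HXY

print(I(1/3))          # ≈ 0: independence
print(I(1/6), I(1/2))  # both endpoints: ≈ 0.3183 nats
```

Note that the maximum $\approx 0.3183$ nats is well below the rough bound $\min(H(X), H(Y)) \approx 0.6365$, consistent with the observation that the bound is not attained.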
EDIT: In hindsight, it should have been obvious that $I(X,Y)$ is maximized at the edge cases $t=1/6$ or $t=1/2$, where one cell of the joint table vanishes (so one value of one variable determines the other a.s.). So if all you care about is the maximum possible mutual information, the above is a lot of unnecessary work.