If two random variables $X, Y$ have high mutual information $I(X;Y)$, intuitively does that mean $X,Y$ have almost deterministic relationships, say $Y=f(X)+\epsilon$ where $\epsilon$ is a noise random variable? If this is true, can we prove this relationship?
Thanks!
Your question is rather vague ("high" relative to what? keeping fixed what and varying what?), but...
$I(X;Y)=H(X)-H(X \mid Y)$
Since $H(X\mid Y)\ge 0$, this gives the bound $I(X;Y)\le H(X)$. With respect to this bound, then, holding $H(X)$ fixed, $I(X;Y)$ is maximized when $I(X;Y)=H(X)$. This happens exactly when $H(X\mid Y)=0$, which means that $X$ depends deterministically on $Y$: $X=g(Y)$. Of course, the same can be said with the variables reversed (but the two statements are not equivalent).
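As a quick numerical check of the claim above, here is a sketch (the distribution and the function $g$ are my own toy choices, not from the question) verifying that for discrete variables $I(X;Y)=H(X)$ precisely when $X=g(Y)$:

```python
# Toy check: when X = g(Y), we get H(X|Y) = 0 and hence I(X;Y) = H(X).
import math

def entropy(p):
    """Shannon entropy in bits of a marginal dict {value: prob}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def mutual_information(joint):
    """I(X;Y) in bits from a joint dict {(x, y): prob}."""
    px, py = {}, {}
    for (x, y), q in joint.items():
        px[x] = px.get(x, 0.0) + q
        py[y] = py.get(y, 0.0) + q
    return sum(q * math.log2(q / (px[x] * py[y]))
               for (x, y), q in joint.items() if q > 0)

# Hypothetical example: Y uniform on {0,1,2,3}, X = g(Y) = Y mod 2.
joint = {(y % 2, y): 0.25 for y in range(4)}
px = {0: 0.5, 1: 0.5}
print(mutual_information(joint))  # 1.0 bit
print(entropy(px))                # 1.0 bit, so I(X;Y) = H(X)
```

Both numbers agree at $1$ bit, matching $I(X;Y)=H(X)$ for a deterministic $X=g(Y)$.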
If you think of the particular model $Y=X+Z$ where $Z$ is independent of $X$, and further assume that $X,Z$ are Gaussian, then $I(X;Y)=\frac{1}{2}\log\!\left(1+\frac{\sigma_X^2}{\sigma_Z^2}\right)$ grows as the variance of $Z$ decreases, and the maximum is attained in the limit where $Z=k$ (a constant, not necessarily zero), in which case $Y$ determines $X$.
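A minimal sketch of this last point, using the standard closed form for the Gaussian additive-noise channel (the variance values are arbitrary choices for illustration):

```python
# Gaussian model Y = X + Z, X ~ N(0, sx2), Z ~ N(0, sz2), Z independent of X.
# Closed form: I(X;Y) = 0.5 * log2(1 + sx2 / sz2), in bits.
import math

def gaussian_mi(sx2, sz2):
    """Mutual information (bits) for the additive Gaussian noise model."""
    return 0.5 * math.log2(1.0 + sx2 / sz2)

# As the noise variance sz2 shrinks, I(X;Y) grows without bound:
for sz2 in (1.0, 0.1, 0.01, 0.001):
    print(sz2, gaussian_mi(1.0, sz2))
```

The printed values increase as `sz2` shrinks, illustrating that $I(X;Y)\to\infty$ as $Z$ collapses toward a constant, i.e. as the relationship becomes deterministic.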