As correlation
$\rho_{X,Y} := \frac{\operatorname{Cov}(X,Y)}{\sigma_X \sigma_Y}$
sort of measures the linear dependence of two random variables, and mutual information
$I(X; Y) := H(X) - H(X|Y)$
measures the general dependence of two random variables, I feel like it should be possible to get an upper bound on the correlation in terms of the mutual information.
I expect that as the mutual information increases, the correlation tends to 1, and that as the mutual information tends to 0, so does the correlation.
Can anyone help me formalise this?
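For concreteness, here is a minimal sketch that computes both quantities for a hypothetical 2×2 joint distribution of binary $X$ and $Y$ (the particular numbers are just an illustration):

```python
import numpy as np

# Hypothetical joint distribution of binary X, Y in {0, 1}:
# rows index x, columns index y.
p = np.array([[0.4, 0.1],
              [0.1, 0.4]])

vals = np.array([0, 1])
px = p.sum(axis=1)                  # marginal P(X = x)
py = p.sum(axis=0)                  # marginal P(Y = y)

# Correlation: rho = Cov(X, Y) / (sigma_X * sigma_Y)
ex, ey = px @ vals, py @ vals
cov = vals @ p @ vals - ex * ey     # E[XY] - E[X]E[Y]
rho = cov / np.sqrt((px @ vals**2 - ex**2) * (py @ vals**2 - ey**2))

# Mutual information: I(X; Y) = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) )
mi = np.sum(p * np.log2(p / np.outer(px, py)))

print(f"rho = {rho:.3f}")           # 0.600
print(f"I   = {mi:.3f} bits")       # about 0.278 bits
```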
For the mutual information, it can be useful to rewrite the conditional entropy in terms of the joint entropy: $$H(X|Y) = H(X,Y) - H(Y)$$
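Substituting this into the definition from the question gives the symmetric form
$$I(X;Y) = H(X) - H(X|Y) = H(X) + H(Y) - H(X,Y),$$
which makes it clear that mutual information treats $X$ and $Y$ interchangeably.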
However, the claim in the question is incorrect: correlation captures only linear dependence, while mutual information captures dependence in general.
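For a quick sketch of how the two can come apart (an illustration with an assumed toy distribution, separate from the covariance example below): take $X$ uniform on $\{-1, 0, 1\}$ and $Y = X^2$. Then $\operatorname{Cov}(X,Y) = 0$, so the correlation is 0, yet $I(X;Y) = H(Y) > 0$ because $Y$ is a deterministic function of $X$.

```python
import numpy as np

# Illustration (assumed toy example): X uniform on {-1, 0, 1}, Y = X**2.
xs = np.array([-1, 0, 1])
px = np.full(3, 1/3)                     # P(X = x)
ys = xs**2                               # Y is a deterministic function of X

# Covariance: E[XY] - E[X]E[Y] = E[X^3] - 0 = 0, hence correlation 0
cov = px @ (xs * ys) - (px @ xs) * (px @ ys)
print("Cov(X, Y) =", cov)                # 0.0

# Mutual information from the joint distribution P(X = x, Y = y);
# Y takes the value 0 with prob. 1/3 and 1 with prob. 2/3.
py = {0: 1/3, 1: 2/3}
joint = {(-1, 1): 1/3, (0, 0): 1/3, (1, 1): 1/3}
mi = sum(pxy * np.log2(pxy / ((1/3) * py[y])) for (x, y), pxy in joint.items())
print("I(X; Y) =", round(mi, 3), "bits") # log2(3) - 2/3, about 0.918 bits
```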
Going back one step to covariance, we can find the following example: