Bound for type of correlation measure


Assume you have a finite, discrete probability distribution for a pair of jointly distributed random variables $X$ and $Y$ such that $P(X=i,Y=j) = p_{i,j}$ for $i \in \{1, \dots, |X|\}$, $j \in \{1, \dots, |Y|\}$. The marginal distributions are given by $P(X=i) = p_i = \sum_{j=1}^{|Y|} p_{i,j}$ and similarly $P(Y=j) = p_j = \sum_{i=1}^{|X|} p_{i,j}$.

I would like to get a "good" upper bound in terms of the mutual information for the following expression: $$ \left|\sum_{i,j} (p_{i,j} - p_i p_j) \log(p_i) \log(p_j)\right|. $$

Now, I can do $$ \begin{aligned} \left|\sum_{i,j} (p_{i,j} - p_i p_j) \log(p_i) \log(p_j)\right| &\leq \sum_{i,j} |p_{i,j} - p_i p_j|\,|\log(p_i)|\,|\log(p_j)| \\ &\leq \left(\sum_{i,j} |p_{i,j} - p_i p_j|\right) \max_{i,j} |\log(p_i)|\,|\log(p_j)| \\ &= \|P_{XY} - P_X P_Y\|_1\,|\log(\min_i p_i)|\,|\log(\min_j p_j)| \\ &\leq \sqrt{2 I(X:Y)}\,|\log(\min_i p_i)|\,|\log(\min_j p_j)|, \end{aligned} $$ where $I(X:Y)$ denotes the mutual information; the first step is the triangle inequality and the last is Pinsker's inequality. Note that I can assume without loss of generality that $p_i, p_j > 0$ for all $i,j$, since zero terms simply disappear from the original sum (taking $0\log 0 = 0$).
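For concreteness, here is a small numeric sanity check of this inequality chain on a random joint distribution, using natural logarithms throughout (so Pinsker's inequality reads $\|P-Q\|_1 \leq \sqrt{2\,D(P\|Q)}$); the alphabet sizes $4 \times 3$ are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random strictly positive joint distribution p[i, j] on a 4 x 3 alphabet.
p = rng.random((4, 3))
p /= p.sum()

px = p.sum(axis=1)       # marginal of X
py = p.sum(axis=0)       # marginal of Y
prod = np.outer(px, py)  # product distribution P_X P_Y

# Left-hand side: |sum_{i,j} (p_{ij} - p_i p_j) log(p_i) log(p_j)|
lhs = abs(np.sum((p - prod) * np.outer(np.log(px), np.log(py))))

# Mutual information I(X:Y) = KL(P_XY || P_X P_Y), in nats.
mi = np.sum(p * np.log(p / prod))

# Intermediate and final bounds from the chain above.
l1 = np.abs(p - prod).sum()
bound = np.sqrt(2 * mi) * abs(np.log(px.min())) * abs(np.log(py.min()))

assert l1 <= np.sqrt(2 * mi) + 1e-12   # Pinsker's inequality
assert lhs <= bound + 1e-12            # the full chain
print(lhs, bound)
```

Shrinking one marginal probability toward zero while holding the joint dependence fixed makes `bound` blow up through the $|\log(\min_i p_i)|$ factors, even though `lhs` stays bounded, which is exactly the weakness described below.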

However, this bound is not good enough for my purposes: it can be made arbitrarily large simply by decreasing the smallest (non-zero) marginal probability. Instead, I'm looking for a bound of order $O\!\left(I(X:Y)\log(|X||Y|)\right)$.