Estimating mutual information from local densities


Consider the two multivariate random variables $X$ and $Y$. Their mutual information (MI) is defined as $$\begin{aligned} I(X;Y) &\stackrel{\mathrm{def}}{=} D_\mathrm{KL}(P_{(X,Y)}||P_{X}\otimes P_{Y})\,, \\ &= \int_\mathcal{X}\int_\mathcal{Y} p_{(X,Y)}(x, y) \log \frac{p_{(X,Y)}(x, y)}{p_{X}(x)p_{Y}(y)} \mathrm{d}x\mathrm{d}y\,. \\ \end{aligned}$$
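For context (a standard fact, not specific to my setup): when $(X, Y)$ is jointly Gaussian, this integral has the closed form

$$ I(X;Y) = \frac{1}{2}\log\frac{|\Sigma_{X}||\Sigma_{Y}|}{|\Sigma_{(X,Y)}|}\,, $$

which is the kind of expression I am hoping to recover from local covariance information below.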

What is available to estimate this is a set of points $\{(x_i, y_i)\}_{i=1}^{N}$ drawn from $(X, Y)$. Locally around these points the covariance matrices $\Sigma_{(x_i, y_i)}$, $\Sigma_{x_i|y_i}$ and $\Sigma_{y_i|x_i}$ are known (calculated as $\mathrm{E}_\Xi((\xi - \xi_i)(\xi - \xi_i)^\mathrm{T})$). The distributions are assumed to be continuous around $(x_i, y_i)$, so the densities can be approximated (up to a normalization constant that cancels below) as $$ p(\xi) \propto \frac{1}{|\Sigma_\xi|^{\frac{1}{2}}}\,, $$ and the MI locally is $$ \mathrm{LMI}_{X, Y}(x_i, y_i)\stackrel{\mathrm{def}}{=} I(X;Y \mid x\in B_\varepsilon(x_i), y\in B_\varepsilon(y_i)) = \frac{1}{2}\log\frac{|\Sigma_{x_i|y_i}||\Sigma_{y_i|x_i}|}{|\Sigma_{(x_i,y_i)}|}\,, $$ where $B_\varepsilon(\xi)$ denotes an $\varepsilon$-neighborhood of $\xi$.
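To make the LMI formula concrete, here is a minimal sketch (function name and inputs are my own, hypothetical choices) that evaluates the determinant expression above, sanity-checked on a bivariate Gaussian where $\Sigma_{x|y} = \Sigma_{y|x} = 1-\rho^2$ and $|\Sigma_{(x,y)}| = 1-\rho^2$:

```python
import numpy as np

def lmi(S_joint, S_x_given_y, S_y_given_x):
    """Local MI via the determinant formula: 0.5 * log(|S_x|y| |S_y|x| / |S_joint|).

    All three arguments are local covariance matrices around one point
    (x_i, y_i); scalars are accepted for 1-D X and Y.
    """
    return 0.5 * np.log(
        np.linalg.det(np.atleast_2d(S_x_given_y))
        * np.linalg.det(np.atleast_2d(S_y_given_x))
        / np.linalg.det(np.atleast_2d(S_joint))
    )

# Sanity check: standard bivariate Gaussian with correlation rho.
rho = 0.8
S_joint = np.array([[1.0, rho], [rho, 1.0]])   # |S_joint| = 1 - rho^2
val = lmi(S_joint, 1.0 - rho**2, 1.0 - rho**2)
# val = 0.5 * log(1 - rho^2), which is negative for rho != 0
```

Note that in this Gaussian check the LMI is negative, already hinting that its average cannot equal the (nonnegative) MI.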

Naively, I would like to average these local estimates to obtain an estimate of $I(X;Y)$. However, $$\begin{aligned} \frac{1}{N} \sum_i \mathrm{LMI}_{X, Y}(x_i, y_i) &\approx \int_\mathcal{X}\int_\mathcal{Y} p_{(X,Y)}(x, y) \log \frac{p_{(X,Y)}(x, y)}{p_{X|Y}(x|y)p_{Y|X}(y|x)} \mathrm{d}x\mathrm{d}y\,, \\ &\stackrel{?}{\neq} \int_\mathcal{X}\int_\mathcal{Y} p_{(X,Y)}(x, y) \log \frac{p_{(X,Y)}(x, y)}{p_{X}(x)p_{Y}(y)} \mathrm{d}x\mathrm{d}y\,. \end{aligned}$$
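A quick numerical check of this discrepancy, assuming a standard bivariate Gaussian pair where all densities are known in closed form (the variable names are mine): since $p_{X|Y}\,p_{Y|X} = p_{(X,Y)}^2/(p_X\,p_Y)$, the first integrand equals $-\log\frac{p_{(X,Y)}}{p_X p_Y}$ pointwise, so the naive average lands on the *negative* of the MI rather than the MI itself.

```python
import numpy as np

rng = np.random.default_rng(0)
rho, n = 0.8, 200_000
s2 = 1.0 - rho**2  # conditional variance Var(X|Y) = Var(Y|X)

# Sample from a standard bivariate Gaussian with correlation rho.
cov = np.array([[1.0, rho], [rho, 1.0]])
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

# Exact log-densities of the joint, marginals, and conditionals.
log_p_xy = (-np.log(2 * np.pi) - 0.5 * np.log(s2)
            - (x**2 - 2 * rho * x * y + y**2) / (2 * s2))
log_p_x = -0.5 * np.log(2 * np.pi) - x**2 / 2
log_p_y = -0.5 * np.log(2 * np.pi) - y**2 / 2
log_p_x_given_y = -0.5 * np.log(2 * np.pi * s2) - (x - rho * y)**2 / (2 * s2)
log_p_y_given_x = -0.5 * np.log(2 * np.pi * s2) - (y - rho * x)**2 / (2 * s2)

naive = np.mean(log_p_xy - log_p_x_given_y - log_p_y_given_x)  # naive average
mi_mc = np.mean(log_p_xy - log_p_x - log_p_y)  # Monte Carlo estimate of I(X;Y)
true_mi = -0.5 * np.log(s2)                    # closed form for the Gaussian

print(naive, mi_mc, true_mi)  # naive is (up to MC error) -true_mi
```

So at least in this Gaussian case the two integrals genuinely differ in sign.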

Is there a way to leverage this extra information about the local densities and still obtain an unbiased estimate of the MI?