Relationship between cross entropy and differential cross entropy


Short question:

The relationship between entropy and differential entropy is widely known. However, is there any relationship between cross entropy and differential cross entropy?

Long question:

(Background)

Consider a continuous random variable $X_{1}$ with density $p_{X_{1}}(x)$ and differential entropy $h(X_{1})$. Let's quantize $X_{1}$ with bins of length $\Delta$, and call the resulting discrete random variable $X_{1}^{\Delta}$. By the mean value theorem for integrals, for each bin there exists $x_{i}\in [i\Delta, (i+1)\Delta]$ satisfying \begin{equation} p_{X_{1}}(x_{i})\Delta=\int_{i\Delta}^{(i+1)\Delta}p_{X_{1}}(x)dx, \end{equation} so the entropy of $X_{1}^{\Delta}$ is \begin{align} H(X_{1}^{\Delta})&=-\sum_{i=-\infty}^{\infty}p_{X_{1}}(x_{i})\Delta\log(p_{X_{1}}(x_{i})\Delta)\\ &=-\sum_{i=-\infty}^{\infty}p_{X_{1}}(x_{i})\Delta\log(p_{X_{1}}(x_{i}))-\log(\Delta), \end{align} where the last step uses $\sum_{i}p_{X_{1}}(x_{i})\Delta=1$. Since the first term is a Riemann sum, this implies $\lim_{\Delta\to 0}(H(X_{1}^{\Delta})+\log(\Delta))=h(X_{1})$.
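As a numerical sanity check of this known limit (not part of the argument itself), the sketch below computes $H(X^{\Delta})+\log(\Delta)$ for a standard Gaussian, with bin probabilities obtained from the CDF; all function names here are my own:

```python
import math

def norm_cdf(x, mu=0.0, sigma=1.0):
    # CDF of N(mu, sigma^2) via the error function
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def quantized_entropy(delta, mu=0.0, sigma=1.0, span=10.0):
    # H(X^Delta) = -sum_i p_i log p_i over bins [i*delta, (i+1)*delta],
    # truncated to +-span standard deviations (tail bins are negligible)
    n = int(span * sigma / delta)
    h = 0.0
    for i in range(-n, n):
        p = norm_cdf((i + 1) * delta, mu, sigma) - norm_cdf(i * delta, mu, sigma)
        if p > 0.0:
            h -= p * math.log(p)
    return h

# differential entropy of N(0, 1) in nats: (1/2) log(2*pi*e)
h_exact = 0.5 * math.log(2.0 * math.pi * math.e)
for delta in (0.5, 0.1, 0.01):
    print(delta, quantized_entropy(delta) + math.log(delta), h_exact)
```

As $\Delta$ shrinks, the printed approximation approaches the closed-form differential entropy, consistent with the limit above (logs here are natural, so all quantities are in nats).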

Again, consider a continuous random variable $X_{2}$ with density $p_{X_{2}}(x)$ and differential entropy $h(X_{2})$. Let's quantize $X_{2}$ with bins of length $\Delta$, and call the resulting discrete random variable $X_{2}^{\Delta}$. By the mean value theorem for integrals, for each bin there exists $x_{i}'\in [i\Delta, (i+1)\Delta]$ satisfying \begin{equation} p_{X_{2}}(x_{i}')\Delta=\int_{i\Delta}^{(i+1)\Delta}p_{X_{2}}(x)dx, \end{equation} so the entropy of $X_{2}^{\Delta}$ is \begin{align} H(X_{2}^{\Delta})&=-\sum_{i=-\infty}^{\infty}p_{X_{2}}(x_{i}')\Delta\log(p_{X_{2}}(x_{i}')\Delta)\\ &=-\sum_{i=-\infty}^{\infty}p_{X_{2}}(x_{i}')\Delta\log(p_{X_{2}}(x_{i}'))-\log(\Delta), \end{align} which likewise implies $\lim_{\Delta\to 0}(H(X_{2}^{\Delta})+\log(\Delta))=h(X_{2})$.

(Main question)

The cross entropy between $X_{1}^{\Delta}$ and $X_{2}^{\Delta}$ is \begin{align} H(X_{1}^{\Delta}, X_{2}^{\Delta})&=-\sum_{i=-\infty}^{\infty}p_{X_{1}}(x_{i})\Delta\log(p_{X_{2}}(x_{i}')\Delta)\\ &=-\sum_{i=-\infty}^{\infty}p_{X_{1}}(x_{i})\Delta\log(p_{X_{2}}(x_{i}'))-\log(\Delta). \end{align} However, the first term is not a Riemann sum because $x_{i}\neq x_{i}'$ in general, so we cannot immediately conclude that it converges to $h(X_{1}, X_{2})$. Is there a way to express the first term as a Riemann sum so that it converges to $h(X_{1}, X_{2})$? Or does the first term even converge to $h(X_{1}, X_{2})$? Intuitively, I believe it does, but I cannot prove it, precisely because the first term is not a Riemann sum. My thought was that if, for each bin, there exists $x_{i}^{*}\in [i\Delta, (i+1)\Delta]$ satisfying \begin{equation} p_{X_{1}}(x_{i}^{*})\log(p_{X_{2}}(x_{i}^{*}))=p_{X_{1}}(x_{i})\log(p_{X_{2}}(x_{i}')), \end{equation} then everything would be resolved, because the first term could then be written as a Riemann sum. However, I do not know whether the existence of such an $x_{i}^{*}$ is always guaranteed.
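For what it's worth, a numerical experiment (a sketch, not a proof) is consistent with the intuition that $H(X_{1}^{\Delta}, X_{2}^{\Delta})+\log(\Delta)$ converges to the differential cross entropy. Note that $p_{X_{1}}(x_{i})\Delta$ and $p_{X_{2}}(x_{i}')\Delta$ are exactly the bin probabilities, so the quantity in question can be computed from CDF differences; the Gaussian parameters below are an arbitrary example instance, and the comparison value is the standard closed-form Gaussian cross entropy in nats:

```python
import math

def norm_cdf(x, mu, sigma):
    # CDF of N(mu, sigma^2) via the error function
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def quantized_cross_entropy(delta, mu1, mu2, sigma, span=12.0):
    # H(X1^Delta, X2^Delta) = -sum_i P_i log Q_i, where P_i and Q_i are the
    # exact bin probabilities p_{X1}(x_i)*delta and p_{X2}(x_i')*delta
    n = int(span * sigma / delta)
    h = 0.0
    for i in range(-n, n):
        p = norm_cdf((i + 1) * delta, mu1, sigma) - norm_cdf(i * delta, mu1, sigma)
        q = norm_cdf((i + 1) * delta, mu2, sigma) - norm_cdf(i * delta, mu2, sigma)
        if p > 0.0 and q > 0.0:  # skip numerically empty tail bins
            h -= p * math.log(q)
    return h

mu1, mu2, sigma = 0.0, 1.0, 1.0
# closed-form differential cross entropy of N(mu1, s^2) w.r.t. N(mu2, s^2), in nats
h_cross = 0.5 * math.log(2.0 * math.pi * sigma**2) \
    + (sigma**2 + (mu1 - mu2)**2) / (2.0 * sigma**2)
for delta in (0.5, 0.1, 0.01):
    print(delta, quantized_cross_entropy(delta, mu1, mu2, sigma) + math.log(delta), h_cross)
```

The printed values approach the closed form as $\Delta\to 0$, which supports (but of course does not establish) the conjectured limit.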

By the way, the distributions I specifically consider are Gaussian distributions with the same variance but different means, namely $X_{1}\sim\mathcal{N}(\mu_{1}, \sigma^{2})$ and $X_{2}\sim\mathcal{N}(\mu_{2}, \sigma^{2})$.
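In case it helps, for this Gaussian case the candidate limit has a standard closed form (taking $\log$ to be the natural log, so the result is in nats): \begin{align} h(X_{1}, X_{2})&=-\int_{-\infty}^{\infty}p_{X_{1}}(x)\log(p_{X_{2}}(x))dx\\ &=\frac{1}{2}\log(2\pi\sigma^{2})+\frac{\mathbb{E}[(X_{1}-\mu_{2})^{2}]}{2\sigma^{2}}\\ &=\frac{1}{2}\log(2\pi\sigma^{2})+\frac{\sigma^{2}+(\mu_{1}-\mu_{2})^{2}}{2\sigma^{2}}, \end{align} using $\mathbb{E}[(X_{1}-\mu_{2})^{2}]=\sigma^{2}+(\mu_{1}-\mu_{2})^{2}$. So any proposed convergence argument can at least be checked against this value.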