Joint differential entropy of sum of random variables: $h(X,X+Y)=h(X,Y)$?


I see the following simplification used frequently in the literature, but I have not been able to verify it.

Let $X$ and $Y$ be absolutely continuous (i.e. they have pdfs) $\mathbb{R}^d$-valued random variables. Assume the joint variable $(X,X+Y)$ is absolutely continuous on $\mathbb{R}^{2d}$. Then $$h(X,X+Y)=h(X,Y).$$

Here $h$ signifies differential entropy, defined by $$h(W)=-\int_{\mathbb{R}^{d_W}}f_W(w)\log(f_W(w))\ dw$$ whenever $W$ is an $\mathbb{R}^{d_W}$-valued random variable with pdf $f_W$.
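
As a quick numerical sanity check (not a proof), the identity can be verified in the jointly Gaussian case using the closed form $h(\mathcal{N}(\mu,\Sigma))=\frac{1}{2}\log\big((2\pi e)^{n}\det\Sigma\big)$. The dimension and covariance in the sketch below are arbitrary choices made only for illustration.

```python
# Sanity check of h(X, X+Y) = h(X, Y) for jointly Gaussian (X, Y),
# using h(N(mu, Sigma)) = 0.5 * log((2*pi*e)^n * det(Sigma)) (in nats).
import numpy as np

rng = np.random.default_rng(0)
d = 2  # arbitrary dimension, chosen only for illustration

# Random positive-definite covariance for the joint vector (X, Y) in R^{2d}.
M = rng.standard_normal((2 * d, 2 * d))
Sigma = M @ M.T + 2 * d * np.eye(2 * d)

def gaussian_entropy(cov):
    """Differential entropy of a Gaussian with covariance `cov`, in nats."""
    n = cov.shape[0]
    return 0.5 * np.log((2 * np.pi * np.e) ** n * np.linalg.det(cov))

# Linear map sending (X, Y) to (X, X+Y), i.e. A = [[I, 0], [I, I]].
A = np.block([[np.eye(d), np.zeros((d, d))],
              [np.eye(d), np.eye(d)]])

h_XY   = gaussian_entropy(Sigma)            # h(X, Y)
h_XXpY = gaussian_entropy(A @ Sigma @ A.T)  # h(X, X+Y), since Cov(AW) = A Sigma A^T

print(h_XY, h_XXpY)      # the two values agree
print(np.linalg.det(A))  # det A = 1
```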

Note 1: $X$ and $Y$ are not assumed to be independent.

Note 2: Examples where the left-hand side is finite but the right-hand side is undefined would also be accepted as counterexamples.

I am also wondering: if the statement can be proved, is it more generally true that $$h(X,g(X,Y))=h(X,Y),$$ where $g$ is a deterministic function of its arguments?

This question is similar, but seems to concern Shannon entropy (i.e. discrete variables). Shannon entropy and differential entropy have different sets of properties, as discussed in these links: answer1, answer2, question1, and question2.

Accepted answer

Use the fact that if $W$ has a pdf and $A$ is an invertible linear transformation, then \begin{align*} h(AW)&=h(W)+\log|\det A|. \end{align*}
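
For completeness, a short sketch of why this holds, via the change-of-variables formula for densities: writing $V=AW$, we have $f_V(v)=f_W(A^{-1}v)/|\det A|$, so substituting $w=A^{-1}v$ (hence $dv=|\det A|\,dw$) gives
\begin{align}
h(AW) &= -\int f_V(v)\log f_V(v)\ dv \\
&= -\int \frac{f_W(A^{-1}v)}{|\det A|}\Big(\log f_W(A^{-1}v)-\log|\det A|\Big)\ dv \\
&= -\int f_W(w)\log f_W(w)\ dw + \log|\det A| \\
&= h(W)+\log|\det A|.
\end{align}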

In this case, let $W=[X \ \ Y]^T$, a vector-valued variable in $\mathbb{R}^{2d}$. Let \begin{align} A=\left[\begin{array}{c c} I_d & 0 \\ I_d & I_d \end{array}\right], \end{align} where $I_d$ is the $d\times d$ identity block. Then $AW=[X \ \ X+Y]^T$, and $A$ is invertible with $\det A=1$, since it is block lower triangular with identity blocks on the diagonal. Therefore, \begin{align} h(X,X+Y)&=h(AW) \\ &= h(W)+\log|\det A| \\ &= h(X,Y) + \log(1) \\ &= h(X,Y). \end{align}

Second answer

An alternative to @HaarD's proof is the following.

Using the chain rule,

$$ \begin{align} h(X, X+Y) &= h(X) + h(X+Y\mid X)\\ &=h(X)+h(Y|X)\\ &=h(X,Y), \end{align} $$ where the first equality is an application of the chain rule, the second holds because translating a random variable by a constant does not change its differential entropy (applied conditionally on $X$, as spelled out below), and the third is again the chain rule.
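
Spelling out the second equality: conditioned on $X=x$, the variable $X+Y$ is just $Y$ shifted by the constant $x$, so
$$ h(X+Y\mid X)=\int f_X(x)\,h(X+Y\mid X=x)\ dx=\int f_X(x)\,h(Y\mid X=x)\ dx=h(Y\mid X). $$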

This result does not generalize to arbitrary $g(X,Y)$: for example, with $g(X,Y)=2Y$ the scaling property gives $h(X,2Y)=h(X,Y)+d\log 2$, and with $g(X,Y)=X$ the joint variable $(X,X)$ is not even absolutely continuous.