Positivity of Renyi Mutual Information

The differential Renyi entropy of a probability distribution is given by $H_q(P(X))=\frac{1}{1-q}\log\int p^q(x)dx$. In the limit $q\to 1$, it reduces to the usual Shannon entropy. We can write down the mutual information between two variables $X$ and $Y$ simply as $I(X;Y)=H_q(P(X))+H_q(P(Y))-H_q(P(X,Y))$. Is this always a non-negative quantity? Again, in the case $q=1$ it is very easy to show, but what about in general?
EDIT. I justify the positivity of the Renyi mutual information using its interpretation as a Renyi divergence. I follow the expositions in
T. Cover, J. A. Thomas, "Elements of Information Theory" (Chapter 2)
and
D. Xu, D. Erdogmus, "Renyi's Entropy, Divergence and their Nonparametric Estimators".
In the setting of "classical" information theory the mutual information $I(X,Y)$ of the random variables $X$ and $Y$ is defined as
$$I(X,Y):=D_{KL}(p_{XY}||p_Xq_Y),$$
where $D_{KL}(p_{XY}||p_Xq_Y),$ denotes the Kullback Leibler divergence (KL divergence) between the joint probability $p_{XY}$ and the product $p_Xq_Y$ of the prob. distribution of $X$ and $Y$.
Using Jensen's inequality on the KL divergence, it follows that $I(X,Y)$ is always non-negative. I refer to the first reference for the computation in the discrete case.
Introducing the Shannon entropies $H(X)$, $H(Y)$ of $X$ resp. $Y$ and the joint entropy $H(X,Y)$, we arrive at the equivalent formulation
$$I(X,Y)=H(X)+H(Y)-H(X,Y).$$
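In the discrete case this is easy to check numerically. Here is a minimal sketch (using NumPy; the joint distribution is made up purely for illustration) that computes $I(X,Y)$ both as a KL divergence and via the entropy identity, and confirms the two agree and are non-negative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random joint distribution p_XY on a 4 x 5 alphabet (illustrative only).
p_xy = rng.dirichlet(np.ones(20)).reshape(4, 5)
p_x = p_xy.sum(axis=1)   # marginal of X
p_y = p_xy.sum(axis=0)   # marginal of Y

def kl(p, q):
    """Discrete KL divergence D_KL(p || q), with the 0*log(0) = 0 convention."""
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

def shannon(p):
    """Discrete Shannon entropy H(p) in nats."""
    mask = p > 0
    return -np.sum(p[mask] * np.log(p[mask]))

# Mutual information as a KL divergence ...
mi_kl = kl(p_xy.ravel(), np.outer(p_x, p_y).ravel())
# ... and via the identity I(X,Y) = H(X) + H(Y) - H(X,Y).
mi_ent = shannon(p_x) + shannon(p_y) - shannon(p_xy.ravel())

print(mi_kl, mi_ent)   # the two values agree (up to rounding) and are >= 0
```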
Let us now consider the Renyi $\alpha$-setting. With
$$H_{\alpha}(X)=\frac{1}{1-\alpha}\log\int p^{\alpha}_X(x)dx$$
we denote the Renyi entropy of the r.v. $X$. The Renyi divergence of the distribution $g(x)$ from the distribution $f(x)$ is
$$D_{\alpha}(f||g):=\frac{1}{\alpha-1}\log\int f(x)\left(\frac{f(x)}{g(x)}\right)^{\alpha-1}dx.$$
It can be proved (see the second reference at p. 81) that
$$D_{\alpha}(f||g)\geq 0 \quad \forall~f, g ~\text{and}~\alpha>0,~~(*)$$ $$\lim_{\alpha\rightarrow 1}D_{\alpha}(f||g)=D_{1}(f||g)=D_{KL}(f||g).~~(**)$$
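Both properties are easy to check numerically in the discrete case; the following is only a sanity-check sketch with arbitrary, made-up distributions:

```python
import numpy as np

def renyi_div(f, g, alpha):
    """Discrete Renyi divergence D_alpha(f || g), for alpha > 0, alpha != 1."""
    mask = f > 0
    return np.log(np.sum(f[mask] * (f[mask] / g[mask]) ** (alpha - 1.0))) / (alpha - 1.0)

def kl_div(f, g):
    """Discrete KL divergence, the alpha -> 1 limit of D_alpha."""
    mask = f > 0
    return np.sum(f[mask] * np.log(f[mask] / g[mask]))

rng = np.random.default_rng(1)
f = rng.dirichlet(np.ones(6))   # two arbitrary distributions on 6 symbols
g = rng.dirichlet(np.ones(6))

for alpha in (0.5, 0.9, 0.999, 1.001, 2.0, 5.0):
    print(alpha, renyi_div(f, g, alpha))   # non-negative, property (*)
print("KL:", kl_div(f, g))                 # values near alpha = 1 approach this, property (**)
```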
The Renyi mutual information $I_{\alpha}(X,Y)$ is defined naturally as the Renyi divergence between the joint distribution $p_{XY}$ of $X$ and $Y$ and the product of the marginal distributions $p_X$, $p_Y$, i.e.
$$I_{\alpha}(X,Y):=D_{\alpha}(p_{XY}||p_Xp_Y).$$
This is a definition; you can find it, for example, at p. 83 of the second reference. You can justify it through the overall $\alpha$-setting and the limit
$$\lim_{\alpha\rightarrow 1}I_{\alpha}(X,Y)=I(X,Y),$$
which follows from property $(**)$ of the Renyi divergence. This limit is parallel to the fundamental limit $\lim_{\alpha\rightarrow 1}H_{\alpha}(X)=H(X)$.
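For completeness, the entropy limit can be made explicit with l'Hôpital's rule (assuming one may differentiate under the integral sign):
$$\lim_{\alpha\to 1}H_{\alpha}(X)=\lim_{\alpha\to 1}\frac{\log\int p_X^{\alpha}(x)dx}{1-\alpha}=\lim_{\alpha\to 1}\frac{\int p_X^{\alpha}(x)\log p_X(x)dx}{-\int p_X^{\alpha}(x)dx}=-\int p_X(x)\log p_X(x)dx=H(X).$$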
From property $(*)$ one immediately derives the non-negativity of the Renyi mutual information.
For these reasons, I would prove the non-negativity of the Renyi mutual information through the above definition. At the present stage I have not been able to prove that
$$I_{\alpha}(X,Y)=H_{\alpha}(X)+H_{\alpha}(Y)-H_{\alpha}(X,Y),$$
or to find such a characterization in the literature. Even in the discrete case I got stuck because of the coefficient $\frac{1}{1-\alpha}$ in front of the entropies: the cases $0<\alpha<1$ and $\alpha>1$ must be studied separately, and a straightforward application of Jensen's inequality does not seem possible.
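That said, the entropy-combination form is easy to probe numerically. The following exploratory sketch (discrete case, random joint distributions; not a proof in either direction) simply reports the smallest value of $H_{\alpha}(X)+H_{\alpha}(Y)-H_{\alpha}(X,Y)$ found over many random trials:

```python
import numpy as np

def renyi_entropy(p, alpha):
    """Discrete Renyi entropy H_alpha(p) for alpha > 0, alpha != 1."""
    return np.log(np.sum(p ** alpha)) / (1.0 - alpha)

rng = np.random.default_rng(2)

for alpha in (0.5, 2.0):
    worst = np.inf
    for _ in range(10_000):
        # Random joint distribution on a 3 x 3 alphabet (illustrative only).
        p_xy = rng.dirichlet(np.ones(9)).reshape(3, 3)
        p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
        val = renyi_entropy(p_x, alpha) + renyi_entropy(p_y, alpha) \
              - renyi_entropy(p_xy.ravel(), alpha)
        worst = min(worst, val)
    # A negative 'worst' would show the entropy combination can fail to be
    # non-negative for that alpha; a non-negative 'worst' is inconclusive.
    print(alpha, worst)
```

If the reported minimum is negative for some $\alpha$, the entropy combination cannot be non-negative in general; if it stays non-negative, the experiment of course proves nothing, and the question of relating $I_{\alpha}$ to the Renyi entropies remains open here.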