KL divergence for distribution representing sums of iid random variables

255 Views Asked by Bumbble Comm At 28 Mar 2026 - 1:28

Sorry if my description is inaccurate, I hope it's understandable.

Given $X_1,\dots,X_n$, a sequence of $n$ i.i.d. Bernoulli RVs with mean $p$, and a similar sequence $Y_1,\dots,Y_n$ with mean $q$, we know that the KL divergence between the probability measure $P$ corresponding to the sum $X=X_1+\dots+X_n$ and the measure $Q$ corresponding to $Y=Y_1+\dots+Y_n$ is (since $X$ and $Y$ are binomial) $$KL(P,Q)=n\,d(p,q),$$ where $d(p,q)$ stands for the KL divergence between the measures of Bernoulli RVs with means $p$ and $q$. A similar identity also holds for the divergence between measures corresponding to sums of independent Gaussians (for instance, when the two families share a common variance and differ only in their means).
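
The binomial identity above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the values of `n`, `p`, `q` are illustrative choices, not part of the question, and it assumes $0<p,q<1$.

```python
from math import comb, log

def kl_bernoulli(p, q):
    """KL divergence d(p, q) between Bernoulli(p) and Bernoulli(q), in nats."""
    return p * log(p / q) + (1 - p) * log((1 - p) / (1 - q))

def kl_binomial(n, p, q):
    """KL divergence between Binomial(n, p) and Binomial(n, q), summed over the support."""
    total = 0.0
    for k in range(n + 1):
        pk = comb(n, k) * p**k * (1 - p) ** (n - k)  # P(X = k)
        qk = comb(n, k) * q**k * (1 - q) ** (n - k)  # Q(Y = k)
        total += pk * log(pk / qk)
    return total

# Illustrative parameters (assumed, not from the question).
n, p, q = 10, 0.3, 0.6
print(kl_binomial(n, p, q), n * kl_bernoulli(p, q))
```

The two printed values should agree up to floating-point error, because the binomial coefficients cancel inside the logarithm.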

My question is whether we can claim this holds in general. That is, given i.i.d. $X_1,\dots,X_n$ such that each RV has measure $P_x$, and similarly $Y_1,\dots,Y_n$ with measure $Q_y$, define $X=X_1+\dots+X_n$ and $Y=Y_1+\dots+Y_n$, where the measure $P$ corresponds to $X$ and $Q$ to $Y$. Can we say that $$KL(P,Q)=n\,KL(P_x,Q_y)?$$ Thank you in advance!

There are 2 best solutions below

Bumbble Comm On 19 Mar 2023 - 1:59 BEST ANSWER

This is correct if $P_x$ and $Q_y$ belong to the same natural exponential family, which is indeed the case for your example. To see this, consider the exponential family generated by a measure $\mu$ on $\mathbb{R}$, namely $P_{\theta}(dx)=e^{\theta x-k(\theta)}\mu(dx).$ Then the $n$-fold convolution is $$P^{*n}_{\theta}(dx)=e^{\theta x-nk(\theta)}\mu^{*n}(dx).$$ Since the mean of $P^{*n}_{\theta_1}$ is $nk'(\theta_1)$, we get $$D(P^{*n}_{\theta_1}\|P^{*n}_{\theta_2})=\int\left[(\theta_1-\theta_2)x-n\bigl(k(\theta_1)-k(\theta_2)\bigr)\right]P^{*n}_{\theta_1}(dx)=n\left[(\theta_1-\theta_2)k'(\theta_1)-\bigl(k(\theta_1)-k(\theta_2)\bigr)\right],$$ which is exactly $n$ times the value for $n=1$, that is, $n\,D(P_{\theta_1}\|P_{\theta_2})$.
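
As a sanity check of the closed form, here is a small sketch that specializes it to the Bernoulli family, assuming $\mu$ is the counting measure on $\{0,1\}$, so that $k(\theta)=\log(1+e^{\theta})$ and $\theta=\log\frac{p}{1-p}$. The function names and parameter values are illustrative, not taken from the answer.

```python
from math import exp, log

def k(theta):
    # Log-partition function of the Bernoulli family (mu = counting measure on {0, 1}).
    return log(1 + exp(theta))

def k_prime(theta):
    # Derivative of k: the mean of P_theta, i.e. the success probability.
    return exp(theta) / (1 + exp(theta))

def kl_exp_family(n, theta1, theta2):
    # Closed form n[(theta1 - theta2) k'(theta1) - (k(theta1) - k(theta2))].
    return n * ((theta1 - theta2) * k_prime(theta1) - (k(theta1) - k(theta2)))

def kl_bernoulli(p, q):
    # Direct KL divergence between Bernoulli(p) and Bernoulli(q), in nats.
    return p * log(p / q) + (1 - p) * log((1 - p) / (1 - q))

# Illustrative parameters (assumed).
n, p, q = 10, 0.3, 0.6
theta1, theta2 = log(p / (1 - p)), log(q / (1 - q))  # natural parameters
print(kl_exp_family(n, theta1, theta2), n * kl_bernoulli(p, q))
```

Both printed values should coincide up to rounding, recovering $n\,d(p,q)$ from the question.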

Bumbble Comm On 19 Mar 2023 - 12:40

I think I found a counter-example; please correct me if I'm wrong. For $X_1,X_2\sim \text{Uniform}(0,0.5)$ and $Y_1,Y_2\sim \text{Uniform}(0,1)$ we know that (using the notation in the last paragraph of my question above) $$D(P_x,Q_y)=\log 2~.$$ In this case $X$ and $Y$ are triangularly distributed: $X$ has density $4x$ on $[0,0.5]$ and $4-4x$ on $[0.5,1]$, while $Y$ has density $x$ on $[0,1]$. A direct calculation then gives $$D(P,Q)=\int_{0}^{0.5}4x\log\left(\frac{4x}{x}\right)dx+\int_{0.5}^{1}(4-4x)\log\left(\frac{4-4x}{x}\right)dx=1 \neq 2\log 2~.$$
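
The integral in this counter-example can be checked numerically. Below is a rough sketch using a plain midpoint rule from the standard library; the function names and the number of cells are arbitrary choices made for illustration.

```python
from math import log

def p_density(x):
    # Density of X = X1 + X2 with X_i ~ Uniform(0, 0.5): triangular on [0, 1], peak at 0.5.
    return 4 * x if x <= 0.5 else 4 - 4 * x

def q_density(x):
    # Density of Y = Y1 + Y2 with Y_i ~ Uniform(0, 1), restricted to [0, 1] where P has mass.
    return x

def kl_midpoint(num_cells=200_000):
    # Midpoint rule on [0, 1]; midpoints avoid the endpoints where a density vanishes.
    h = 1.0 / num_cells
    total = 0.0
    for i in range(num_cells):
        x = (i + 0.5) * h
        px = p_density(x)
        total += px * log(px / q_density(x)) * h
    return total

print(kl_midpoint(), 2 * log(2))
```

The first printed value should be close to $1$, strictly below $2\log 2$. This is consistent with the data-processing inequality, which gives $D(P,Q)\le n\,D(P_x,Q_y)$ in general because the sum is a function of $(X_1,\dots,X_n)$.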
