Monotonicity of the divergence

Let $P, Q$ be probability measures on a measurable space $(\mathcal{X}, \mathcal{F})$, and suppose $P$ is absolutely continuous with respect to $Q$. For any sub-$\sigma$-algebra $\mathcal{G} \subset \mathcal{F}$, we define $$ D_{\mathcal G}(P \| Q) := \sum_{A} P(A) \log \frac{P(A)}{Q(A)}, $$ where the sum ranges over the atoms $A$ of $\mathcal{G}$. I have seen it claimed that $D_{\mathcal{H}}(P\|Q) \leq D_{\mathcal{G}}(P \|Q)$ whenever $\mathcal{H} \subset \mathcal{G}$ is a sub-$\sigma$-algebra.

I can see that this is true if each atom $A$ of $\mathcal{H}$ can be written as $A = \cup_i B_i$ for some collection $B_1, \dots, B_m$ of atoms of $\mathcal{G}$: the claim then follows from the convexity of the function $t \mapsto t \log t$ (the so-called log-sum inequality). This condition holds in some simple examples, but I am not sure whether it holds in general. I am struggling to prove the claim and would be interested in any ideas that could help establish it.
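
To spell out the step I have in mind, here is a sketch under the assumption above, namely that an atom $A$ of $\mathcal{H}$ is a finite disjoint union $A = \cup_{i=1}^m B_i$ of atoms of $\mathcal{G}$, with the convention $0 \log 0 = 0$. The log-sum inequality states that for nonnegative numbers $a_1, \dots, a_m$ and $b_1, \dots, b_m$, $$ \Big(\sum_{i=1}^m a_i\Big) \log \frac{\sum_{i=1}^m a_i}{\sum_{i=1}^m b_i} \leq \sum_{i=1}^m a_i \log \frac{a_i}{b_i}. $$ Taking $a_i = P(B_i)$ and $b_i = Q(B_i)$, so that $\sum_i a_i = P(A)$ and $\sum_i b_i = Q(A)$, gives $$ P(A) \log \frac{P(A)}{Q(A)} \leq \sum_{i=1}^m P(B_i) \log \frac{P(B_i)}{Q(B_i)}. $$ If, in addition, every atom of $\mathcal{G}$ lies in exactly one atom of $\mathcal{H}$, then summing this over the atoms $A$ of $\mathcal{H}$ would give $D_{\mathcal{H}}(P\|Q) \leq D_{\mathcal{G}}(P\|Q)$. What I cannot justify is the decomposition of $A$ into atoms of $\mathcal{G}$ in general.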