Conditional Probability and KL Divergence


Let $\mathbb{P}_A$ denote the distribution conditioned on $A$, so that for any measurable set $C$ we have $\mathbb{P}_A(C)=\mathbb{P}(A\cap C)/\mathbb{P}(A)$. On page 80 of the reference cited here, the author claims that the KL divergence satisfies $D(\mathbb{P}_A\,\|\,\mathbb{P})=\log\frac{1}{\mathbb{P}(A)}$. I don't know how to derive this equality: when I try, I am left with a term involving $\mathbb{P}(C)$ and $\mathbb{P}(C\cap A)$ that does not cancel.

Wainwright, Martin J., High-dimensional statistics. A non-asymptotic viewpoint, ZBL07021501.
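
For what it's worth, here is one possible derivation. It is a sketch that assumes $\mathbb{P}(A)>0$ and the usual measure-theoretic definition $D(\mathbb{Q}\,\|\,\mathbb{P})=\int \log\frac{d\mathbb{Q}}{d\mathbb{P}}\,d\mathbb{Q}$ for $\mathbb{Q}\ll\mathbb{P}$; the symbol $\mathbf{1}_A$ (the indicator function of $A$) is notation introduced here, not taken from the question. First, the definition of $\mathbb{P}_A$ identifies its density with respect to $\mathbb{P}$, since for every measurable $C$

$$\mathbb{P}_A(C)=\frac{\mathbb{P}(A\cap C)}{\mathbb{P}(A)}=\int_C \frac{\mathbf{1}_A}{\mathbb{P}(A)}\,d\mathbb{P}\quad\Longrightarrow\quad \frac{d\mathbb{P}_A}{d\mathbb{P}}=\frac{\mathbf{1}_A}{\mathbb{P}(A)}.$$

Because $\mathbb{P}_A(A^c)=0$, only the part of the integral over $A$ contributes, and there the density is the constant $1/\mathbb{P}(A)$:

$$D(\mathbb{P}_A\,\|\,\mathbb{P})=\int_A \log\frac{1}{\mathbb{P}(A)}\,d\mathbb{P}_A=\log\frac{1}{\mathbb{P}(A)}\cdot\mathbb{P}_A(A)=\log\frac{1}{\mathbb{P}(A)}.$$

Under this reading, the set $C$ only serves to pin down the density. The divergence integrates the logarithm of that density pointwise rather than a ratio of set probabilities such as $\mathbb{P}(A\cap C)/\mathbb{P}(C)$, so no $C$-dependent term should remain.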