Definition of $\chi^2$-divergence between probability distributions

$ % Some definitions \def\D#1#2{\operatorname{D_{\chi^2}}(#1 \| #2)} \def\Df#1#2{\operatorname{D}_{f}(#1 \| #2)} \def\E#1#2{\operatorname*{\mathbb{E}_{#1}}\left[#2\right]} \def\dee{\mathop{\mathrm{d}\!}} \def\var#1{\operatorname{var}(#1)} $

I'm interested in the $\chi^2$-divergence as a way of comparing probability distributions. I've seen it defined in two ways. Are they equivalent?

Let $P,Q$ be two probability measures on a set $\mathscr X$ equipped with a $\sigma$-algebra, and let $p,q$ be their densities with respect to some dominating measure, denoted $\dee x$.

  1. The first definition I've seen is $$\D{P}{Q} \equiv \int_{\mathscr X}\frac{(p(x)-q(x))^2}{q(x)}\dee{x}$$ This definition makes the relationship with the chi-squared test clear.
  2. The second definition I've seen is $$\D{P}{Q} \equiv \E{Q}{{\left(\frac{\dee{P}}{\dee{Q}}-1\right)}^2} = \int_{\mathscr X}{\left(\frac{\dee{P}}{\dee{Q}}-1\right)}^2\dee{Q}$$ where $\frac{\dee{P}}{\dee{Q}}$ is the Radon–Nikodym derivative of $P$ with respect to $Q$. This definition is used when defining the $\chi^2$-divergence as an $f$-divergence, that is, $\Df{P}{Q}\equiv\E{Q}{f\left(\frac{\dee{P}}{\dee{Q}}\right)}$ with $f(t)=(t-1)^2$.
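
As a sanity check, here is a minimal numerical sketch on a finite set, assuming counting measure as the dominating measure and $q(x)>0$ everywhere, so that the Radon–Nikodym derivative is simply the ratio $p(x)/q(x)$. Under those assumptions the expectation form reduces to a sum with $q$ in the denominator. The function names and the example vectors `p` and `q` are made up for illustration.

```python
import math


def chi2_density_form(p, q):
    """Sum of (p - q)^2 / q over a finite set (counting measure as dominating measure)."""
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))


def chi2_expectation_form(p, q):
    """E_Q[(dP/dQ - 1)^2], taking dP/dQ = p/q, which assumes q > 0 at every point."""
    return sum(qi * (pi / qi - 1.0) ** 2 for pi, qi in zip(p, q))


# Hypothetical probability mass functions on a three-point set.
p = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]

density_value = chi2_density_form(p, q)
expectation_value = chi2_expectation_form(p, q)
swapped_value = chi2_density_form(q, p)  # roles of P and Q exchanged

print(density_value, expectation_value)  # the two forms agree up to rounding
print(swapped_value)                     # differs: the divergence is not symmetric
print(math.isclose(density_value, expectation_value))
```

This only covers the case where $P$ is absolutely continuous with respect to $Q$; it says nothing about what either definition should mean when $q(x)=0$ on a set where $p(x)>0$, which is the part I'm unsure about.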

My question is: are these definitions completely identical, or does the fact that the second one makes no mention of a dominating measure mean that it is more general?