Reference for the Gibbs variational principle/dual characterization of KL


Let $P,Q$ be two probability distributions. Then, one has the following dual characterization of their Kullback-Leibler divergence (relative entropy):

$$ D(P \,\|\,Q) = \sup_f ( \mathbb{E}_P[f(X)] - \log \mathbb{E}_Q[e^{f(X)}] ) \tag{1} $$

This characterization is sometimes referred to as the Gibbs variational principle, or the Donsker-Varadhan formula; however, I couldn't track down where it was first proven (there is often a reference to a 1983 paper of Donsker and Varadhan, but I couldn't find where in that paper (1) is actually established).

Where, specifically, was (1) first established? What should I cite?
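
For concreteness, here is a quick numerical sanity check of (1). It is only a sketch under simplifying assumptions that are mine and not part of the general statement: a finite alphabet, $Q$ strictly positive, and $P$ absolutely continuous with respect to $Q$. The helper names `kl` and `dv_objective` and the example distributions are made up for illustration. In this setting the supremum is attained at $f^* = \log \frac{dP}{dQ}$, since then $\mathbb{E}_Q[e^{f^*(X)}] = 1$.

```python
import math
import random

def kl(p, q):
    """KL divergence D(P || Q) for finite distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def dv_objective(f, p, q):
    """Right-hand side of (1) for one fixed f: E_P[f] - log E_Q[exp(f)]."""
    mean_f = sum(pi * fi for pi, fi in zip(p, f))
    log_mgf = math.log(sum(qi * math.exp(fi) for qi, fi in zip(q, f)))
    return mean_f - log_mgf

# Illustrative distributions on a three-point alphabet (arbitrary choice).
p = [0.5, 0.3, 0.2]
q = [0.2, 0.2, 0.6]

# Candidate maximizer f* = log(dP/dQ).
f_star = [math.log(pi / qi) for pi, qi in zip(p, q)]

# Randomly drawn f should never exceed the KL divergence.
rng = random.Random(0)
best_random = max(
    dv_objective([rng.uniform(-3.0, 3.0) for _ in p], p, q)
    for _ in range(2000)
)

print("D(P||Q)          =", kl(p, q))
print("objective at f*  =", dv_objective(f_star, p, q))
print("best random f    =", best_random)
```

The objective at $f^*$ matches $D(P\,\|\,Q)$ up to floating-point error, and no random $f$ exceeds it. This only illustrates the finite case and says nothing about the general measure-theoretic statement I am asking about.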