I am reading the article Pricing via utility maximization and entropy by Richard Rouge and Nicole El Karoui. They define the relative entropy of a probability measure $Q$ with respect to a probability measure $P$ by $h(Q \vert P) := E[dQ/dP \ln(dQ/dP)]$ if $Q \ll P$, and $+\infty$ otherwise, where $E$ denotes expectation under $P$. They also introduce the free energy of a random variable $B$, defined as $\ln E[\exp B]$. They claim that for a bounded random variable $B$, entropy and free energy are related by the following duality:
\begin{align} \ln E[\exp B] = \sup_{Q \ll P} [ E^{Q}[B] - h(Q \vert P) ] \end{align}
Does anyone know a reference that gives the proof, or have an idea of how to derive this equation? It may be trivial, but I have not seen a supremum used in this way before.
The interesting feature is that this formula lets us deduce the stochastic game between an agent and the market.
This is a result about the convex conjugate (Legendre-Fenchel transform) of the map $Q \mapsto h(Q \vert P)$: the free energy $B \mapsto \ln E[\exp B]$ is exactly that conjugate.
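A sketch of the standard argument, in the question's notation (so $P$ is the reference measure) and assuming $B$ is bounded. The symbol $Q^*$ is introduced here for illustration and is not taken from the article. Define $Q^*$ by the density below. Since $B$ is bounded, $Q^*$ is equivalent to $P$, so every $Q \ll P$ also satisfies $Q \ll Q^*$, with $dQ/dQ^* = (dQ/dP)\, E[e^{B}]\, e^{-B}$. For any $Q \ll P$ with $h(Q \vert P) < \infty$:
\begin{align}
\frac{dQ^*}{dP} &:= \frac{e^{B}}{E[e^{B}]}, \\
h(Q \vert Q^*) &= E^{Q}\left[\ln \frac{dQ}{dP}\right] - E^{Q}[B] + \ln E[e^{B}] = h(Q \vert P) - E^{Q}[B] + \ln E[e^{B}], \\
E^{Q}[B] - h(Q \vert P) &= \ln E[e^{B}] - h(Q \vert Q^*) \le \ln E[e^{B}].
\end{align}
The last inequality uses $h(Q \vert Q^*) \ge 0$, which follows from Jensen's inequality, with equality if and only if $Q = Q^*$. So the supremum is attained at $Q^*$ and equals $\ln E[\exp B]$. If $h(Q \vert P) = +\infty$, the quantity $E^{Q}[B] - h(Q \vert P)$ is $-\infty$ and there is nothing to check.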
Here is a proof for discrete random variables (note that $P$ and $Q$ are defined opposite to yours), which I believe can be generalized.
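For intuition, the discrete case can also be checked numerically. The following is a minimal sketch in the question's notation ($P$ is the reference measure), on a finite sample space with made-up values for $P$ and $B$. All function names are hypothetical helpers written for this illustration. It compares $\ln E[\exp B]$ with the objective $E^{Q}[B] - h(Q \vert P)$ at the candidate maximizer $dQ/dP \propto e^{B}$ and at randomly drawn $Q$.

```python
import math
import random


def free_energy(p, b):
    """ln E_P[exp B] on a finite sample space."""
    return math.log(sum(pi * math.exp(bi) for pi, bi in zip(p, b)))


def relative_entropy(q, p):
    """h(Q|P) = sum_i q_i ln(q_i / p_i), using the convention 0 ln 0 = 0.

    Assumes every p_i > 0, so that Q << P holds automatically.
    """
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def objective(q, p, b):
    """E^Q[B] - h(Q|P), the quantity inside the supremum."""
    return sum(qi * bi for qi, bi in zip(q, b)) - relative_entropy(q, p)


def gibbs_measure(p, b):
    """Candidate maximizer: q_i proportional to p_i * exp(b_i)."""
    weights = [pi * math.exp(bi) for pi, bi in zip(p, b)]
    total = sum(weights)
    return [w / total for w in weights]


def random_measure(n, rng):
    """A random probability vector of length n (normalized uniform draws)."""
    weights = [rng.random() for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative values only: P on four points and a bounded B.
p = [0.1, 0.2, 0.3, 0.4]
b = [1.0, -0.5, 2.0, 0.0]

rng = random.Random(0)
best_random = max(objective(random_measure(len(p), rng), p, b) for _ in range(10_000))

print("free energy           :", free_energy(p, b))
print("objective at Gibbs Q  :", objective(gibbs_measure(p, b), p, b))
print("best over random Q    :", best_random)
```

The first two printed numbers should agree up to floating-point error, and the third should not exceed them, which is what the duality predicts. Random sampling only probes the supremum from below, so this is a sanity check rather than a proof.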