Help with Variational Calculus & Leibniz Rule


Let $f: (0,\infty)\to \mathbb{R}$ be convex and lower semicontinuous with $f(1)=0$, and let $\mu$ and $\hat{\mu}$ be two probability distributions on a measurable space $\mathcal{X}$ that are absolutely continuous w.r.t. the Lebesgue measure $\lambda$. I want to prove that the supremum $$ \sup_{T:\mathcal{X}\to \operatorname{dom}(f)}\int_{\mathcal{X}}\left(T(x)\frac{d\mu}{d\lambda}(x)-f(T(x))\frac{d\hat{\mu}}{d\lambda}(x)\right)d\lambda(x) $$ is attained at $T^*(x)=(f')^{-1}\left(\frac{d\mu}{d\hat{\mu}}(x)\right)$. My first idea was to define a functional $H: \mathcal{X}\times \mathcal{T}\to \mathbb{R}$, $(x,T(x))\mapsto T(x)\frac{d\mu}{d\lambda}(x)-f(T(x))\frac{d\hat{\mu}}{d\lambda}(x)$, and to search for the mapping $T$ such that $$ \frac{d}{dT}\int_\mathcal{X}H(x,T(x))\,d\lambda(x) = 0. $$ This would be easy if the integral and the derivative could be interchanged, so that I only need to compute the derivative of $H(x,T(x))$ with respect to the mapping $T$ (Leibniz rule). However, I am not sure whether this rule can be applied in this variational context, or which assumptions are needed (e.g. on $T$, $f$, or $\mathcal{X}$).
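
For concreteness, here is a sketch of the pointwise calculation I have in mind. It rests on assumptions that are not part of the statement above: $f$ is differentiable with strictly increasing $f'$ (so that $(f')^{-1}$ exists), $\mu\ll\hat{\mu}$, and the ratio $\frac{d\mu}{d\hat{\mu}}(x)$ lies in the range of $f'$. The shorthand $p=\frac{d\mu}{d\lambda}$, $q=\frac{d\hat{\mu}}{d\lambda}$ and the function $h_x$ are introduced here only for illustration. For fixed $x$, consider the integrand as a function of a scalar $t\in\operatorname{dom}(f)$: $$ h_x(t) = t\,p(x) - f(t)\,q(x). $$ Since $f$ is convex and $q(x)\ge 0$, $h_x$ is concave in $t$, so a stationary point is a maximizer. Where $q(x)>0$, $$ h_x'(t) = p(x) - f'(t)\,q(x) = 0 \iff f'(t) = \frac{p(x)}{q(x)} = \frac{d\mu}{d\hat{\mu}}(x) \iff t = (f')^{-1}\left(\frac{d\mu}{d\hat{\mu}}(x)\right). $$ If this is right, then $h_x(T(x))\le h_x(T^*(x))$ for every admissible $T$ and $\lambda$-almost every $x$, and integrating this inequality would give $$ \int_{\mathcal{X}}H(x,T(x))\,d\lambda(x) \le \int_{\mathcal{X}}H(x,T^*(x))\,d\lambda(x) $$ using only monotonicity of the integral, without exchanging differentiation and integration, provided $T^*$ is measurable and both integrals are well defined.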