I am learning convex analysis on my own and I would appreciate some help.
I know that the convex conjugate is a generalization of the Legendre transform. I also know the formula for the Legendre transform (as stated on Wikipedia).
However, I am reading a paper in which I cannot understand how the Legendre transform is calculated. Fact II.2 there states a dual formula for the quantum relative entropy (its Legendre transform). Could you explain how the two expressions in Fact II.2 are equivalent? How can I calculate similar Legendre transforms?
Edit: More specifically, I don't understand why the following statements are dual in the mentioned paper:
$$ D(\rho\,\|\,\sigma) = \sup_{w \in P_{\geq}(A)}\Big\{\operatorname{tr} \rho \log w - \log\operatorname{tr}\exp(\log w + \log \sigma)\Big\} $$ and $$ \log\operatorname{tr}\exp(H + \log \sigma) = \sup_{w \in S(A)}\Big\{\operatorname{tr} H w - D(w\,\|\,\sigma)\Big\} $$
[55]: D. Sutter, M. Berta, and M. Tomamichel. Multivariate trace inequalities. Communications in Mathematical Physics, 352(1):37–58, 2017. DOI: 10.1007/s00220-016-2778-5.
The first statement is basically the quantum version of the Donsker-Varadhan formula and is obtained using the Legendre transform. The second statement, however, can be derived directly from the first. For an arbitrary density matrix (positive semi-definite operator of trace one) $ \rho $ and any admissible $w$, the first statement gives:
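$$ \operatorname{tr}(H\rho) - D(\rho\,\|\,\sigma) \leq \operatorname{tr}(H\rho) - \operatorname{tr}(\rho \log w) + \log\operatorname{tr}\exp(\log w + \log \sigma) $$ Now if we substitute $w = e^{H}$, we get:
$$ \operatorname{tr}(H\rho) - D(\rho\,\|\,\sigma) \leq \log\operatorname{tr}\exp(H + \log \sigma) $$ so the right-hand side is an upper bound on the supremum over $\rho$. The bound is attained at $\rho = \exp(H + \log\sigma)/\operatorname{tr}\exp(H + \log\sigma)$, for which $D(\rho\,\|\,\sigma) = \operatorname{tr}(H\rho) - \log\operatorname{tr}\exp(H + \log\sigma)$, and this gives the second statement.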
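If it helps to see this numerically, here is a minimal sketch (not taken from the paper) that checks the inequality on random full-rank density matrices and checks that the normalized state $\exp(H + \log\sigma)/\operatorname{tr}\exp(H + \log\sigma)$ closes the gap. The helper names `herm_fun`, `random_density`, `rel_entropy`, and `objective`, the dimension `d = 3`, and the random test matrices are all assumptions made for illustration; only NumPy is used, with matrix functions computed through an eigendecomposition.

```python
import numpy as np

def herm_fun(a, f):
    """Apply a scalar function f to a Hermitian matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(a)
    return (vecs * f(vals)) @ vecs.conj().T

def random_density(d, rng):
    """Random full-rank density matrix (illustrative test data only)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    m = g @ g.conj().T  # positive definite with probability one
    return m / np.trace(m).real

def rel_entropy(rho, sigma):
    """D(rho || sigma) = tr rho (log rho - log sigma), for full-rank inputs."""
    return np.trace(rho @ (herm_fun(rho, np.log) - herm_fun(sigma, np.log))).real

rng = np.random.default_rng(0)
d = 3  # hypothetical dimension for the check
sigma = random_density(d, rng)
g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
H = (g + g.conj().T) / 2  # arbitrary Hermitian H
log_sigma = herm_fun(sigma, np.log)

# Right-hand side of the bound: log tr exp(H + log sigma)
gibbs_unnorm = herm_fun(H + log_sigma, np.exp)
log_partition = np.log(np.trace(gibbs_unnorm).real)

def objective(rho):
    """tr(H rho) - D(rho || sigma), the quantity inside the supremum."""
    return np.trace(H @ rho).real - rel_entropy(rho, sigma)

# The gap should be non-negative for every density matrix rho ...
gaps = [log_partition - objective(random_density(d, rng)) for _ in range(200)]
min_gap = min(gaps)

# ... and should vanish (up to rounding) at the normalized exponential state.
rho_star = gibbs_unnorm / np.trace(gibbs_unnorm).real
gap_at_optimum = log_partition - objective(rho_star)

print("smallest gap over random states:", min_gap)
print("gap at rho_star:", gap_at_optimum)
```

The same pattern works for the first statement: plugging $\log w = \log\rho - \log\sigma$ into the first formula should reproduce $D(\rho\,\|\,\sigma)$ for full-rank $\rho$ and $\sigma$.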