In the following Wikipedia link, it is stated that for any positive semidefinite $\omega$ with $\text{Tr}(\omega) = 1$ and self-adjoint $H$, it holds that
\begin{align} \text{Tr}(\omega H) -\text{Tr}(\omega \log \omega) \leq \log \text{Tr}(\exp{H}) \end{align} with equality if and only if $\omega=\frac{\exp{H}}{\text{Tr}(\exp{H})}$.
To prove this statement, let us denote the left-hand side of the inequality by $f(\omega)$. Using the rule that $\frac{\partial}{\partial X} \text{Tr}(XA) = A^T$, we can differentiate $f(\omega)$. Setting the derivative to zero, we obtain \begin{align} H - I - \log(\omega) = 0 \end{align}
This yields $\omega = \exp(H - I)$, which is almost correct except for normalization. However, I am not sure how to enforce $\text{Tr}(\omega) = 1$. My initial attempt was to add this constraint with a Lagrange multiplier to get
\begin{align} F(\omega, \lambda) = f(\omega) + \lambda(\text{Tr}(\omega) - 1) \end{align}
and now set $\frac{\partial F}{\partial \lambda} = 0$ and $\frac{\partial F}{\partial \omega} = 0$ but this didn't really work.
How can I use Lagrange multipliers to enforce the constraint $\text{Tr}(\omega) = 1$ in the proof of the Gibbs variational principle? In general, can any linear constraint be dealt with in the same way?
We seek $\max(f(w))$ over symmetric $w>0$ subject to the condition $tr(w)=1$ (here $w$ stands for the $\omega$ of the question).
The Lagrange condition is: if $f$ reaches its maximum at $w_0$, then
there is $\lambda$ s.t., for every $k$ symmetric, $Df_{w_0}(k)+\lambda tr(k)=0$, that is,
for every $k$ symmetric, $tr(Hk)-tr(k\log(w_0))-tr(w_0w_0^{-1}k)+\lambda tr(k)=0$.
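The derivative of the entropy term used here can be checked against the standard rule for trace functions, stated without proof for symmetric $w>0$ and a differentiable scalar function $g$ (the symbol $g$ is introduced only for this remark):
\begin{align} D\big(tr\, g(w)\big)(k) = tr\big(g'(w)k\big), \qquad g(x)=x\log(x),\quad g'(x)=\log(x)+1, \end{align}
so that $D\big(tr(w\log(w))\big)_{w_0}(k)=tr(k\log(w_0))+tr(k)$, which matches the two terms $tr(k\log(w_0))$ and $tr(w_0w_0^{-1}k)=tr(k)$ in the condition.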
Thus $H-\log(w_0)+(\lambda-1)I$ has zero trace against every symmetric $k$, so it is skew symmetric; since this expression is also symmetric,
$(*)$ $H-\log(w_0)+(\lambda-1)I=0$.
Thus $\exp(\log(w_0))=w_0=e^{\lambda-1}\exp(H)$; consequently, $tr(w_0)=1=e^{\lambda-1}tr(\exp(H))$ and $e^{1-\lambda}=tr(\exp(H))$.
According to $(*)$, $w_0H-w_0\log(w_0)=(1-\lambda)w_0$ and the required maximum is
$f(w_0)=(1-\lambda)=\log(tr(\exp(H)))$.
Moreover, this maximum is reached only in
$w_0=e^{\lambda-1}\exp(H)=\dfrac{1}{tr(\exp(H))}\exp(H)$.
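As a sanity check, the result can be tested numerically. The following is only a minimal sketch, assuming real symmetric matrices and NumPy; the helper names (`sym_fun`, `f`, `gibbs_state`, `random_density`) and the choice of a random $4\times 4$ matrix $H$ are mine, not part of the argument. It evaluates $f$ at $w_0$ and at many random positive definite matrices of trace $1$, and compares with $\log(tr(\exp(H)))$.

```python
import numpy as np

def sym_fun(a, fun):
    """Apply a scalar function to a symmetric matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(a)
    return (vecs * fun(vals)) @ vecs.T

def f(w, h):
    """f(w) = tr(w h) - tr(w log w) for symmetric positive definite w."""
    return np.trace(w @ h) - np.trace(w @ sym_fun(w, np.log))

def gibbs_state(h):
    """Candidate maximizer w0 = exp(h) / tr(exp(h))."""
    e = sym_fun(h, np.exp)
    return e / np.trace(e)

def random_density(rng, n):
    """Random symmetric positive definite matrix with trace 1."""
    a = rng.standard_normal((n, n))
    w = a @ a.T + 1e-3 * np.eye(n)
    return w / np.trace(w)

rng = np.random.default_rng(0)
n = 4
b = rng.standard_normal((n, n))
h = (b + b.T) / 2                      # arbitrary symmetric test matrix H

w0 = gibbs_state(h)
bound = np.log(np.trace(sym_fun(h, np.exp)))

# Gap between the claimed maximum and f, at w0 and at random trial points.
gap_at_w0 = bound - f(w0, h)
min_gap = min(bound - f(random_density(rng, n), h) for _ in range(1000))

print("gap at w0:", gap_at_w0)
print("smallest gap over random trials:", min_gap)
```

If the computation above is right, the gap at $w_0$ should vanish up to rounding, and the gap at every other trial point should be strictly positive. This does not prove anything, but it is a quick way to catch a sign or normalization error in the derivation.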