Derivative of $A^x$ w.r.t. $x$ inside Trace Operation


On page 520 of my reference (Entropy and information, Quantum Computation and Quantum Information by Nielsen and Chuang), it is given that:


The relative entropy $S(\rho\|\sigma)$ is jointly convex in its arguments, where $S(\rho\|\sigma)=tr(\rho\log\rho)-tr(\rho\log\sigma)$ and $\rho,\sigma$ are positive semidefinite with trace $1$.

Proof:

For arbitrary matrices $A$ and $X$ acting on the same space define $$ I_t(A,X)\equiv tr(X^\dagger A^tXA^{1-t})-tr(X^\dagger XA)\tag{11.98} $$ The first term in this expression is concave in $A$, by Lieb's theorem, and the second term is linear in $A$. Thus, $I_t(A,X)$ is concave in $A$. Define $$ \color{red}{ \begin{align} I(A,X)&=\frac{d}{dt}\bigg|_{t=0}I_t(A,X)\\ &=tr(X^\dagger (\log A)XA)-tr(X^\dagger X(\log A)A) \end{align}}\tag{11.99} $$ Noting that $I_0(A,X)=0$ and using the concavity of $I_t(A,X)$ in $A$ we have \begin{align} I(\lambda A_1+(1-\lambda)A_2,X)&=\lim_{\Delta\to 0}\frac{I_\Delta(\lambda A_1+(1-\lambda)A_2,X)}{\Delta}\tag{11.100}\\ &\ge \lambda\lim_{\Delta\to 0}\frac{I_\Delta(A_1,X)}{\Delta}+(1-\lambda)\lim_{\Delta\to 0}\frac{I_\Delta(A_2,X)}{\Delta}\tag{11.101}\\ &=\lambda I(A_1,X)+(1-\lambda)I(A_2,X)\tag{11.102} \end{align} That is, $I(A,X)$ is a concave function of $A$. Defining the block matrices \begin{align} A\equiv\begin{bmatrix}\rho&0\\0&\sigma\end{bmatrix},\quad X\equiv\begin{bmatrix}0&0\\I&0\end{bmatrix}\tag{11.103} \end{align} we can easily verify that $I(A,X)=-S(\rho\|\sigma)$. The joint convexity of $S(\rho\|\sigma)$ follows from the concavity of $I(A,X)$ in $A$.
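Before asking, I did convince myself numerically that the final identity $I(A,X)=-S(\rho\|\sigma)$ holds for the block matrices in (11.103). Here is a minimal sketch (not from the book; the random matrices and seed are arbitrary, and `scipy.linalg.logm` supplies the matrix logarithm):

```python
import numpy as np
from scipy.linalg import logm

rng = np.random.default_rng(0)

def random_density_matrix(d):
    """Return a random d x d density matrix (positive definite, trace 1)."""
    M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = M @ M.conj().T
    return rho / np.trace(rho)

d = 3
rho = random_density_matrix(d)
sigma = random_density_matrix(d)

# Block matrices from (11.103)
Z = np.zeros((d, d))
A = np.block([[rho, Z], [Z, sigma]])
X = np.block([[Z, Z], [np.eye(d), Z]])

# I(A, X) as defined in (11.99)
logA = logm(A)
I_AX = np.trace(X.conj().T @ logA @ X @ A) - np.trace(X.conj().T @ X @ logA @ A)

# S(rho||sigma) = tr(rho log rho) - tr(rho log sigma)
S = np.trace(rho @ logm(rho)) - np.trace(rho @ logm(sigma))

print(I_AX.real, (-S).real)  # the two values agree
```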


How do we reach $I(A,X)=\frac{d}{dt}\bigg|_{t=0}I_t(A,X)=tr(X^\dagger (\log A)XA)-tr(X^\dagger X(\log A)A)$, where $I_t(A,X)\equiv tr(X^\dagger A^tXA^{1-t})-tr(X^\dagger XA)$?

I think I am confused by the fact that we are taking the derivative of $A^t$ with respect to $t$ inside the trace operation.
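To check that differentiating inside the trace at least gives the right answer, I tried a quick numerical sanity check (a minimal sketch with arbitrary random matrices; `scipy.linalg.fractional_matrix_power` stands in for $A^t$). The finite difference of $I_t$ at $t=0$ does match (11.99):

```python
import numpy as np
from scipy.linalg import logm, fractional_matrix_power

rng = np.random.default_rng(1)
d = 3
M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
A = M @ M.conj().T   # positive definite, so A^t and log A make sense
X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

def I_t(t):
    """I_t(A, X) = tr(X^dag A^t X A^(1-t)) - tr(X^dag X A), eq. (11.98)."""
    At = fractional_matrix_power(A, t)
    A1t = fractional_matrix_power(A, 1 - t)
    return np.trace(X.conj().T @ At @ X @ A1t) - np.trace(X.conj().T @ X @ A)

# Finite difference at t = 0, using I_0(A, X) = 0
h = 1e-6
numeric = I_t(h) / h

# Closed form (11.99)
logA = logm(A)
closed = np.trace(X.conj().T @ logA @ X @ A) - np.trace(X.conj().T @ X @ logA @ A)

print(numeric.real, closed.real)  # agree to ~1e-5
```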

It looks like $\frac{d}{dt}A^t=\log(A)A^t$; why is that the case?
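My best guess is that, since $A$ is positive definite, we can write $A^t=e^{t\log A}$ and differentiate the exponential series term by term: $$\frac{d}{dt}A^t=\frac{d}{dt}\sum_{k=0}^\infty \frac{t^k(\log A)^k}{k!}=\sum_{k=1}^\infty \frac{t^{k-1}(\log A)^k}{(k-1)!}=(\log A)\,e^{t\log A}=(\log A)A^t,$$ where no ordering issues arise because $\log A$ commutes with $A^t$ (both are functions of $A$). Is that the right justification?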

It'd be helpful if someone could point me in the right direction so that I can understand the whole proof.

Cross-posted on QC Stack Exchange