Derivative of $\text{Tr}(A B^{1/2})$ w.r.t. matrix $B$


Similar but more complicated questions have been asked, but I am simply interested in the gradient $$\nabla_B \text{Tr}(A B^{1/2})$$ for symmetric positive-definite $B$ ($A$ can also be assumed to be square, symmetric, and positive-definite if that leads to a computable derivative). Here $(\cdot)^{1/2}$ denotes the matrix square root.

Is there a simple formula for this?

Answers to the existing questions look complicated, and I am not sure whether there is a closed-form expression explicitly in terms of $B$ that one can actually code up (rather than only derive theoretically) and use for things like gradient descent.

Thanks.

PS: The specific gradient I am interested in looks like this: $$\nabla_B \text{Tr}\left((A^{1/2} B A^{1/2})^{1/2}\right).$$ I asked above for simplicity, but maybe this specific structure makes things easier.



BEST ANSWER

For typing convenience, define $$X = A^{1/2}BA^{1/2}$$

Write the objective function in terms of this new variable, then find its differential and gradient. $$\eqalign{ \phi &= {\rm Tr}(X^{1/2}) \cr d\phi &= \tfrac{1}{2}(X^{-1/2})^T:dX \cr &= \tfrac{1}{2}(X^{-1/2})^T:A^{1/2}\,dB\,A^{1/2} \cr &= \tfrac{1}{2}(A^{1/2}X^{-1/2}A^{1/2})^T:dB \cr \frac{\partial \phi}{\partial B} &= \tfrac{1}{2}(A^{1/2}X^{-1/2}A^{1/2})^T \cr }$$

Since $(A,B)$ are both symmetric, the gradient simplifies to $$\frac{\partial \phi}{\partial B} = \tfrac{1}{2}A^{1/2}X^{-1/2}A^{1/2}$$

The colon $(:)$ is a convenient inline product notation for the trace, i.e. $$A:B = {\rm Tr}(A^TB)$$

The properties of the trace allow the terms in this product to be rearranged in many ways: $$\eqalign{ A:BC &= B^TA:C = AC^T:B \cr A:B &= A^T:B^T = B:A \cr }$$
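Since the question asks for something that can be coded up, here is a minimal NumPy sketch of the formula above, checked against a central finite difference. The helper names (`sym_sqrt`, `phi`, `grad_phi`, `random_spd`) are made up for this illustration, and the square roots are computed by eigendecomposition, which assumes $A$ and $B$ are symmetric positive-definite.

```python
import numpy as np

def sym_sqrt(M):
    """Principal square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.T          # V diag(sqrt(w)) V^T

def form_X(A, B):
    """X = A^{1/2} B A^{1/2}, symmetrized to remove round-off asymmetry."""
    A_half = sym_sqrt(A)
    X = A_half @ B @ A_half
    return A_half, 0.5 * (X + X.T)

def phi(A, B):
    """phi = Tr(X^{1/2}) = sum of square roots of the eigenvalues of X."""
    _, X = form_X(A, B)
    return np.sum(np.sqrt(np.linalg.eigvalsh(X)))

def grad_phi(A, B):
    """Gradient from the answer: (1/2) A^{1/2} X^{-1/2} A^{1/2}."""
    A_half, X = form_X(A, B)
    w, V = np.linalg.eigh(X)
    X_inv_half = (V / np.sqrt(w)) @ V.T    # V diag(1/sqrt(w)) V^T
    return 0.5 * A_half @ X_inv_half @ A_half

def random_spd(n, rng):
    """Random symmetric positive-definite test matrix (illustrative only)."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

rng = np.random.default_rng(0)
n = 4
A = random_spd(n, rng)
B = random_spd(n, rng)
G = grad_phi(A, B)

# Directional derivative along a random symmetric direction E:
# analytic value is G:E = Tr(G^T E); compare with a central difference.
E = rng.standard_normal((n, n))
E = 0.5 * (E + E.T)
h = 1e-6
fd = (phi(A, B + h * E) - phi(A, B - h * E)) / (2 * h)
analytic = np.sum(G * E)
print("finite difference:", fd)
print("analytic         :", analytic)
```

The two printed numbers should agree to several digits. For gradient descent, note that a plain step `B - lr * G` is not guaranteed to stay positive-definite, so the step size or a projection has to be handled separately.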