Derivative of Square Root of Matrix with respect to a Scalar


Let $X(\Omega)$ be a positive semidefinite matrix that is a function of a set of parameters $\Omega$. I am interested in both the real symmetric and the Hermitian case.

What is the derivative of the square root of this matrix with respect to an individual parameter $\Omega_i$, i.e. $ {\partial_{\Omega_i}\sqrt{X(\Omega)}} $? Can this derivative be expressed in terms of ${\partial_{\Omega_i}X(\Omega)}$?

There are 3 answers below.

Accepted answer

For typing convenience define the matrices $$ S=\sqrt{X},\quad \dot S=\frac{dS}{d\Omega_i},\quad \dot X=\frac{dX}{d\Omega_i},\quad M=\left(I\otimes S+S^T\otimes I\right)^{-1} $$

Utilizing the vec operation one can proceed as follows. $$\eqalign{ SS &= X \\ S\dot S + \dot SS &= {\dot X} \\ (I\otimes S+S^T\otimes I)\operatorname{vec}(\dot S) &= \operatorname{vec}({\dot X}) \\ \operatorname{vec}(\dot S) &= M\operatorname{vec}({\dot X}) \\ \dot S &= \operatorname{reshape}\left(M\operatorname{vec}\big({\dot X}\big),\; {\rm size}\big(S\big)\right) \\ }$$ If the inverse defining $M$ does not exist, the system may not have an exact solution, but the Moore-Penrose pseudoinverse can be used in its place to obtain a least-squares solution.
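A minimal NumPy sketch of this recipe, using `np.kron` for the Kronecker products; the test matrices $X$ and $\dot X$ below are made-up assumptions, not from the question:

```python
import numpy as np

# Hypothetical test data: a symmetric positive-definite X standing in for
# X(Omega), and a symmetric Xdot standing in for dX/dOmega_i.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
X = A @ A.T + 3 * np.eye(3)          # positive definite
Xdot = rng.standard_normal((3, 3))
Xdot = Xdot + Xdot.T

# S = sqrt(X) via the symmetric eigendecomposition.
lam, P = np.linalg.eigh(X)
S = P @ np.diag(np.sqrt(lam)) @ P.T

# Solve (I (x) S + S^T (x) I) vec(Sdot) = vec(Xdot), then reshape.
# vec is column-stacking, hence order="F".
n = X.shape[0]
M = np.kron(np.eye(n), S) + np.kron(S.T, np.eye(n))
vec_Sdot = np.linalg.solve(M, Xdot.flatten(order="F"))
Sdot = vec_Sdot.reshape((n, n), order="F")

# Sdot solves the Sylvester equation S*Sdot + Sdot*S = Xdot.
assert np.allclose(S @ Sdot + Sdot @ S, Xdot)
```

Note that the column-stacking convention $\operatorname{vec}(AXB)=(B^T\otimes A)\operatorname{vec}(X)$ is what makes the `order="F"` reshapes necessary.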

Answer 2

You can use the Dunford-Taylor-Cauchy integral formula to define the square root of a matrix:

$$ \sqrt{X} = \frac{1}{2\pi i } \oint_\Gamma \sqrt{z} \frac{dz}{z-X} $$

where $\Gamma$ is a closed curve that encircles all the eigenvalues of $X$ in the anticlockwise direction. This curve can be taken far away from the eigenvalues so that it is unaffected by the perturbation when computing the derivative.

Furthermore use

$$ \frac{d}{dt} \frac{1}{z-X} = \frac{1}{z-X} X' \frac{1}{z-X}, $$

(the prime indicates differentiation with respect to $t$). All in all we get

$$ \frac{d}{dt} \sqrt{X} = \frac{1}{2\pi i } \oint_\Gamma \sqrt{z} dz \frac{1}{z-X} X' \frac{1}{z-X}.\ \ \ \ \ (1) $$

A convenient expression can be obtained going to the spectral representation of $X$:

$$ X = \sum_n \lambda_n P_n \ \ \ \ \ (2) $$

with $\lambda_n, P_n$ respectively eigenvalues, eigenprojectors. Plugging it into (1) and evaluating the residues we get

\begin{align} \frac{d}{dt} \sqrt{X} &= \sum_n \frac{1}{2\sqrt{\lambda_n}} P_n X' P_n \\ & + \sum_{n\neq m} \frac{\sqrt{\lambda_n}-\sqrt{\lambda_m} }{\lambda_n - \lambda_m} P_n X' P_m \ \ \ (3) \end{align}

At first sight Eq. (3) is not valid if one of the eigenvalues is zero, much as in @greg's answer. However, looking carefully at the residues one realizes that if there is an eigenvalue $\lambda_{n'}=0$, the corresponding residue is zero. In other words, simply remove $n'$ from the first sum in (3).

With this tweak, Eq. (3) is valid in full generality.
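Eq. (3) can be checked numerically. The sketch below builds the eigenprojectors with NumPy's `eigh` and compares (3) against a central finite difference of the matrix square root; the symmetric test matrices $X$ and $X'$ are made-up assumptions, chosen positive definite with distinct eigenvalues so every term in (3) is well defined:

```python
import numpy as np

# Hypothetical symmetric test data (not from the answer): X is positive
# definite, so the zero-eigenvalue caveat does not arise here.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
X = A @ A.T + 4 * np.eye(4)
Xdot = rng.standard_normal((4, 4))
Xdot = Xdot + Xdot.T

n = X.shape[0]
lam, V = np.linalg.eigh(X)
projs = [np.outer(V[:, k], V[:, k]) for k in range(n)]  # eigenprojectors P_n

# Assemble Eq. (3): diagonal terms P_n X' P_n / (2 sqrt(lam_n)) plus the
# cross terms with coefficient (sqrt(lam_n)-sqrt(lam_m))/(lam_n-lam_m).
dS = np.zeros_like(X)
for i, Pi in enumerate(projs):
    for j, Pj in enumerate(projs):
        if i == j:
            dS += Pi @ Xdot @ Pi / (2 * np.sqrt(lam[i]))
        else:
            coeff = (np.sqrt(lam[i]) - np.sqrt(lam[j])) / (lam[i] - lam[j])
            dS += coeff * (Pi @ Xdot @ Pj)

# Compare with a central finite difference of sqrt(X + t*Xdot) at t = 0.
def msqrt(Y):
    w, Q = np.linalg.eigh(Y)
    return Q @ np.diag(np.sqrt(w)) @ Q.T

h = 1e-6
fd = (msqrt(X + h * Xdot) - msqrt(X - h * Xdot)) / (2 * h)
assert np.allclose(dS, fd, atol=1e-5)
```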

Answer 3

There are two explicit forms of the required derivative.

i) We use greg's method, which reduces to solving

$SS'+S'S=X'$ for $S'$. There is $P\in O(n)$ s.t. $X=P\operatorname{diag}(\lambda_i)P^T$ and $S=P\operatorname{diag}(\sqrt{\lambda_i})P^T$; let $K=[k_{i,j}]=P^TS'P$ and $H=[h_{i,j}]=P^TX'P$.

We deduce the equation in $K$: $\operatorname{diag}(\sqrt{\lambda_i})K+K\operatorname{diag}(\sqrt{\lambda_i})=H$.

We easily obtain $k_{i,j}=\dfrac{h_{i,j}}{\sqrt{\lambda_i}+\sqrt{\lambda_j}}$ (assuming $X$ is positive definite, so the denominators are nonzero) and $S'=PKP^T$.
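A minimal NumPy sketch of method i); the symmetric test matrices standing in for $X$ and $X'$ are hypothetical:

```python
import numpy as np

# Made-up symmetric positive-definite X and symmetric perturbation Xdot.
rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
X = A @ A.T + 2 * np.eye(3)
Xdot = rng.standard_normal((3, 3))
Xdot = Xdot + Xdot.T

lam, P = np.linalg.eigh(X)          # X = P diag(lam) P^T
H = P.T @ Xdot @ P
sq = np.sqrt(lam)
K = H / (sq[:, None] + sq[None, :])  # k_ij = h_ij / (sqrt(l_i) + sqrt(l_j))
Sdot = P @ K @ P.T

# Sdot solves the Sylvester equation S*Sdot + Sdot*S = Xdot.
S = P @ np.diag(sq) @ P.T
assert np.allclose(S @ Sdot + Sdot @ S, Xdot)
```

The broadcast `sq[:, None] + sq[None, :]` builds the matrix of denominators $\sqrt{\lambda_i}+\sqrt{\lambda_j}$ in one step.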

ii) We use the real integral $S'=\int_0^{\infty}e^{-tS}X'e^{-tS}\,dt$, which converges when $S$ is positive definite.
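The integral in ii) can be evaluated by simple quadrature (here a trapezoid rule on a truncated grid) and checked against the Sylvester equation $SS'+S'S=X'$; the test matrices are assumptions, not from the answer:

```python
import numpy as np

# Hypothetical symmetric positive-definite X and symmetric Xdot.
rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3))
X = A @ A.T + 2 * np.eye(3)
Xdot = rng.standard_normal((3, 3))
Xdot = Xdot + Xdot.T

lam, P = np.linalg.eigh(X)
sq = np.sqrt(lam)
S = P @ np.diag(sq) @ P.T           # S = sqrt(X), positive definite

# e^{-tS} via the eigendecomposition of S (eigenvectors P, eigenvalues sq).
def integrand(t):
    E = P @ np.diag(np.exp(-t * sq)) @ P.T
    return E @ Xdot @ E

# Trapezoid rule on [0, 40]; the integrand decays like e^{-2t*min(sq)},
# so the truncation error is negligible here.
ts = np.linspace(0.0, 40.0, 8001)
dt = ts[1] - ts[0]
vals = np.array([integrand(t) for t in ts])
Sdot = (0.5 * (vals[1:] + vals[:-1])).sum(axis=0) * dt

assert np.allclose(S @ Sdot + Sdot @ S, Xdot, atol=1e-3)
```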

For the details, see my post in "Derivative (or differential) of symmetric square root of a matrix".