A formula for the derivative of the matrix square root


Suppose $A \in \mathbb{R}^{p \times p}$ is a symmetric positive semidefinite matrix, so that $A^{1/2}$ is well-defined. Does there exist a formula for $$\frac{\partial A^{1/2} }{\partial A}\,?$$ Thanks so much!

BTW, if we can find an explicit formula for $\frac{\partial A^{1/2} }{\partial A}$, does the following approximation hold: $$B^{1/2} - A^{1/2} = \frac{\partial A^{1/2} }{\partial A} (B - A)+ O(\|B-A\|_2^2)$$ for a suitable $p \times p$ matrix $B$?


On BEST ANSWER

$\def\p#1#2{\frac{\partial #1}{\partial #2}}$ The square root function is $$F(S) = S^{1/2}$$ For an SPD matrix $S$, this relation can be squared, differentiated, vectorized (using the symmetry $F^T=F$), and inverted to yield $$\eqalign{ S &= F^2 \\ dS &= dF\,F + F\,dF \\ {\rm vec}(dS) &= (F\otimes I + I\otimes F)\;{\rm vec}(dF) \\ &= (F\oplus F)\;{\rm vec}(dF) \\ ds &= (F\oplus F)\,df \\ df &= (F\oplus F)^{-1}\,ds \\ &= G\,ds \\ \p{f}{s} &= G \\ }$$ where $(\otimes,\oplus)$ denote the Kronecker product and Kronecker sum, respectively. Note that $F\oplus F$ is indeed invertible for SPD $S$, since its eigenvalues are the pairwise sums $\sqrt{\lambda_i}+\sqrt{\lambda_j}>0$ of the eigenvalues of $F$.
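The derivation above is easy to check numerically. A minimal sketch (the variable names `S`, `F`, `G` follow the answer; the random test matrix and tolerances are my own choices): build a random SPD matrix, form $G=(F\oplus F)^{-1}$, and compare the predicted $dF$ against the exact change in the square root under a small symmetric perturbation.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 4

# Random SPD matrix S and its square root F via eigendecomposition.
M = rng.standard_normal((p, p))
S = M @ M.T + p * np.eye(p)          # SPD by construction
w, V = np.linalg.eigh(S)
F = V @ np.diag(np.sqrt(w)) @ V.T    # F = S^{1/2}

# Kronecker sum F ⊕ F = F ⊗ I + I ⊗ F, and the gradient G = (F ⊕ F)^{-1}.
I = np.eye(p)
G = np.linalg.inv(np.kron(F, I) + np.kron(I, F))

# Small symmetric perturbation dS.
E = 1e-6 * rng.standard_normal((p, p))
dS = E + E.T

# Predicted dF from vec(dF) = G vec(dS); vec = column stacking (order='F').
dF = (G @ dS.flatten(order='F')).reshape(p, p, order='F')

# Exact difference (S + dS)^{1/2} - S^{1/2}.
w2, V2 = np.linalg.eigh(S + dS)
F2 = V2 @ np.diag(np.sqrt(w2)) @ V2.T

err = np.linalg.norm(F2 - F - dF) / np.linalg.norm(dF)
print(err)  # relative error should be second-order small
```

Note that `order='F'` matters: the identity ${\rm vec}(AXB)=(B^T\otimes A)\,{\rm vec}(X)$ underlying the Kronecker-sum step assumes column-stacking vectorization.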

In vectorized form, the first-order Taylor expansion is $$df \approx G\,ds \quad\implies\quad f(s+ds)-f(s) \approx G\,ds$$ The matrix form requires the gradient as a fourth-order tensor and the double-dot product $$dF \approx \Gamma:dS \quad\implies\quad dF_{ij} \approx \sum_{k}\sum_{\ell}\,\Gamma_{ijk\ell}\;dS_{k\ell}$$ Converting between the matrix/tensor forms of the gradient $(G/\Gamma)$ is simple but tedious.
In general $$\eqalign{ G &\in {\mathbb R}^{mn\times pq} \quad\iff\quad \Gamma \in {\mathbb R}^{m\times n\times p\times q} \\ G_{\alpha\beta} &= \Gamma_{ijk\ell} \\ \alpha &= i+(j-1)\,m \\ \beta &= k+(\ell-1)\,p \\ i &= 1+(\alpha-1)\,{\rm mod}\,m \\ j &= 1+(\alpha-1)\,{\rm div}\,m \\ k &= 1+(\beta-1)\,{\rm mod}\,p \\ \ell &= 1+(\beta-1)\,{\rm div}\,p \\ }$$ However, in this particular case $m=n=p=q$, so $G\in{\mathbb R}^{p^2\times p^2}$ and $\Gamma\in{\mathbb R}^{p\times p\times p\times p}$.
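In code the index bookkeeping collapses to a single reshape. A small sketch (0-based indices here, so the 1-based formulas above become $\alpha = i + jm$, $\beta = k + \ell p$; the dimensions are arbitrary test values): with column-major vectorization, $\Gamma$ is just a Fortran-order reshape of $G$.

```python
import numpy as np

m, n, p, q = 2, 3, 4, 5
G = np.arange(m * n * p * q, dtype=float).reshape(m * n, p * q)

# Γ_{ijkℓ} = G_{αβ}: a single Fortran-order (column-major) reshape.
Gamma = G.reshape(m, n, p, q, order='F')

# Spot-check the index formulas, 0-based: α = i + j*m, β = k + ℓ*p.
i, j, k, l = 1, 2, 3, 4
alpha = i + j * m
beta = k + l * p
print(Gamma[i, j, k, l] == G[alpha, beta])  # True
```

The same reshape in reverse (`Gamma.reshape(m * n, p * q, order='F')`) recovers $G$, so the double-dot product $\Gamma:dS$ and the matrix-vector product $G\,{\rm vec}(dS)$ agree entry by entry.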


Extend this solution to positive semidefinite matrices $(A,B)$ by writing them as perturbations $(\lambda,\mu\to 0)$ of an SPD matrix $S$ in symmetric directions $(X,Y)$ $$A = S + \lambda X, \qquad B = S + \mu Y$$ Then we can use the SPD solution, with the gradient evaluated at $\Gamma=\Gamma(S),$ to write $$\eqalign{ F(B) - F(S) &\approx \Gamma:\mu Y \\ F(A) - F(S) &\approx \Gamma:\lambda X \\ }$$ Subtraction then yields the expected formula $$\eqalign{ F(B) - F(A) &\approx \Gamma:(\mu Y - \lambda X) \\ &= \Gamma:(B-A) \\ }$$ since $\mu Y - \lambda X = (B-S)-(A-S) = B-A$ exactly.