Is there a general procedure to take the derivative of an analytic function of a matrix $f(\mathbf{X}) : \mathbb{C}^{n\times n} \rightarrow \mathbb{C}^{n\times n}$ (not the element-wise application of a function, but the matrix function defined with the Taylor series or other equivalent means), with respect to each element of the matrix argument?
Matrix function derivative with respect to matrix elements
Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail). There are 2 answers below.
$\def\p#1#2{\frac{\partial #1}{\partial #2}}\def\e{\varepsilon}\def\R#1{\in{\mathbb R}^{#1}}$Coordinate-wise derivatives are a useful approach which avoids higher-order tensors or transformations (i.e. vectorization) which flatten those tensors into matrices.
First, given the matrix variable $X\R{n\times n},$ its coordinate-wise derivatives are $$\eqalign{ \p{X}{X_{ij}} &= e_i e_j^T \;\doteq\; E_{ij} }$$ where $e_i$ is a Cartesian basis vector and $E_{ij}$ is the single-entry matrix.
Second, given a function defined by the Taylor series
$$F = \sum_{k=0}^\infty \alpha_k X^k$$
then, assuming the series converges for the given $X,\,$ its coordinate-wise derivatives are
$$\p{F}{X_{ij}} = \sum_{k=1}^\infty \alpha_k \left(\sum_{\ell=1}^{k} X^{k-\ell}E_{ij}X^{\ell-1}\right)$$
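As a rough numerical sanity check of the formula above, the following sketch evaluates the double sum for a truncated series and compares it with a central finite difference. The helper names `eval_series` and `series_derivative`, the choice of coefficients ($\alpha_k = 1/k!$ truncated at $k=10$), the test matrix and the step size `h` are all assumptions made for illustration; they are not part of the answer.

```python
import math
import numpy as np

def eval_series(alpha, X):
    """Evaluate F(X) = sum_k alpha[k] * X^k for a finite coefficient list."""
    n = X.shape[0]
    F = np.zeros((n, n))
    P = np.eye(n)                      # running power, starts at X^0
    for a in alpha:
        F = F + a * P
        P = P @ X
    return F

def series_derivative(alpha, X, i, j):
    """dF/dX_ij = sum_{k>=1} alpha[k] * sum_{l=1..k} X^(k-l) E_ij X^(l-1)."""
    n = X.shape[0]
    K = len(alpha) - 1                 # highest power in the truncated series
    E = np.zeros((n, n))
    E[i, j] = 1.0                      # single-entry matrix E_ij
    powers = [np.eye(n)]               # powers[m] = X^m
    for _ in range(K):
        powers.append(powers[-1] @ X)
    D = np.zeros((n, n))
    for k in range(1, K + 1):
        for l in range(1, k + 1):
            D = D + alpha[k] * (powers[k - l] @ E @ powers[l - 1])
    return D

# Illustrative data (assumed): a small random matrix and exp-like coefficients.
rng = np.random.default_rng(0)
X = 0.5 * rng.standard_normal((3, 3))
alpha = [1.0 / math.factorial(k) for k in range(11)]
i, j = 0, 2

D_formula = series_derivative(alpha, X, i, j)

# Central finite difference in the direction E_ij.
h = 1e-6
E = np.zeros((3, 3))
E[i, j] = 1.0
D_fd = (eval_series(alpha, X + h * E) - eval_series(alpha, X - h * E)) / (2 * h)

print("max abs difference:", np.max(np.abs(D_formula - D_fd)))
```

The double loop costs on the order of $K^2$ matrix products per entry $(i,j)$, so this is only meant as a check of the formula, not as an efficient way to compute the derivative.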
To illustrate what can happen, suppose we take the very innocent-looking function $$ F(X)=X^n, $$ where $n$ is a positive integer. Then $$\eqalign{ F'(X)(H) &= \frac d{dt}\Big|_{t = 0}(X+tH)^n \\ &= HX^{n-1} + XHX^{n-2} + X^2HX^{n-3} + \cdots + X^{n-2}HX + X^{n-1}H \\ &= \sum_{k=0}^{n-1} X^kHX^{n-1-k}. }$$
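The $n$-term expansion of the directional derivative can be checked numerically. The sketch below (with assumed names such as `power_directional_derivative`, and an arbitrary random test matrix and exponent) compares it with a finite difference, and also with the "commutative" guess $nX^{n-1}H$, which should only agree when $H$ commutes with $X$.

```python
import numpy as np
from numpy.linalg import matrix_power

def power_directional_derivative(X, H, n):
    """Directional derivative of X^n along H: sum_{k=0}^{n-1} X^k H X^(n-1-k)."""
    return sum(matrix_power(X, k) @ H @ matrix_power(X, n - 1 - k)
               for k in range(n))

# Illustrative data (assumed): random 3x3 matrices and exponent n = 4.
rng = np.random.default_rng(1)
X = rng.standard_normal((3, 3))
H = rng.standard_normal((3, 3))
n = 4

exact = power_directional_derivative(X, H, n)

# Central finite difference of t -> (X + t H)^n at t = 0.
h = 1e-6
fd = (matrix_power(X + h * H, n) - matrix_power(X - h * H, n)) / (2 * h)

# What one would get if matrices commuted.
naive = n * matrix_power(X, n - 1) @ H

print("expansion vs finite difference:", np.max(np.abs(exact - fd)))
print("expansion vs naive n X^(n-1) H:", np.max(np.abs(exact - naive)))

# If H commutes with X (here H = X^2), the naive formula is recovered.
H_comm = X @ X
exact_comm = power_directional_derivative(X, H_comm, n)
naive_comm = n * matrix_power(X, n - 1) @ H_comm
print("commuting case difference:", np.max(np.abs(exact_comm - naive_comm)))
```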
In other words, the fact that matrix multiplication is non-commutative substantially complicates things. On the other hand, when we are differentiating a function of the form $$ F(X) = \text{tr}(f(X)), $$ things work much better since the trace makes up for the lack of commutativity. See Derivative of trace function.
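As a small worked case of this simplification (taking $f(X)=X^n$ purely as an illustrative choice), the cyclic property $\text{tr}(AB)=\text{tr}(BA)$ collapses all $n$ terms of the expansion into one:
$$ \frac d{dt}\Big|_{t = 0}\text{tr}\big((X+tH)^n\big) = \sum_{k=0}^{n-1} \text{tr}\big(X^kHX^{n-1-k}\big) = n\,\text{tr}\big(X^{n-1}H\big). $$
Choosing $H = e_i e_j^T$ then gives the coordinate-wise derivative
$$ \frac{\partial\,\text{tr}(X^n)}{\partial X_{ij}} = n\,e_j^T X^{n-1} e_i = n\,\big(X^{n-1}\big)_{ji}. $$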