How to differentiate functions in matrices

I have always wondered about this. How does one actually differentiate a polynomial or other function of a matrix variable? For elementary functions such as $f(X)=AX+B$, where the matrix $X$ is the variable, I can see that you get $f'(X)=A$, but what about functions such as $$ f(X)=AXA^{-1}? $$ How can I calculate the derivative?

Best answer

Your function $f$ is still linear, so it's still its own derivative. In general you can do it by substituting $X \mapsto X + \epsilon$, where $\epsilon$ is an infinitesimal matrix whose entries are treated as so small that every product of two of them is zero. This is, to my mind, by far the fastest and cleanest way, and I don't know why it isn't taught like this. It is more or less equivalent to using big-O notation and expanding everything up to an $O(\| \epsilon \|^2)$ error, but it saves some notation.
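
For the function in the question, the substitution works out in one line, since the map is linear and no higher-order terms appear:

$$f(X + \epsilon) = A(X + \epsilon)A^{-1} = AXA^{-1} + A \epsilon A^{-1}$$

so the derivative at every $X$ is the linear map $\epsilon \mapsto A \epsilon A^{-1}$.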

Example #0: The derivative of $X^2$ is given by writing

$$(X + \epsilon)^2 = X^2 + \epsilon X + X \epsilon$$

so the derivative at $X$ is the linear map $\epsilon \mapsto \epsilon X + X \epsilon$ (you might have guessed that it's $2X \epsilon$ but this only works if $X$ and $\epsilon$ commute).
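
The "products of infinitesimals vanish" rule can also be carried out mechanically. The following is a minimal sketch, not a definitive implementation: it assumes matrices stored as nested Python lists, and the helper names `matmul`, `matadd`, `dual_mul` and the sample matrices `X` and `E` are illustrative choices that do not come from the answer itself. It also cross-checks the result against a plain finite difference.

```python
# Square matrices are plain nested lists; only the standard library is used.

def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def dual_mul(p, q):
    """Multiply (A + a)(B + b), keeping only first-order terms.

    Each argument is a pair (value, infinitesimal part). The a*b term is
    dropped, which is the rule that products of infinitesimals vanish.
    """
    (A, a), (B, b) = p, q
    return matmul(A, B), matadd(matmul(A, b), matmul(a, B))

# Hypothetical sample data: E is a direction that does not commute with X.
X = [[1.0, 2.0], [3.0, 4.0]]
E = [[0.0, 1.0], [1.0, 0.0]]

# Square X + epsilon: the second slot is the derivative of X^2 applied to E.
value, derivative = dual_mul((X, E), (X, E))

# Independent cross-check: finite difference with a small step t.
t = 1e-6
X_shifted = [[x + t * e for x, e in zip(row_x, row_e)] for row_x, row_e in zip(X, E)]
finite_diff = [[(s - v) / t for s, v in zip(row_s, row_v)]
               for row_s, row_v in zip(matmul(X_shifted, X_shifted), value)]

# The commutative guess 2 X epsilon, for comparison.
naive = [[2 * x for x in row] for row in matmul(X, E)]

print("first-order part:  ", derivative)
print("finite difference: ", finite_diff)
print("naive 2 X epsilon: ", naive)
```

The first two printed matrices should agree up to the step size, while the third differs because `X` and `E` do not commute.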

Example #1: The derivative of $X^{-1}$ is given by writing

$$(X + \epsilon)^{-1} = \left( X(1 + X^{-1} \epsilon) \right)^{-1} = (1 + X^{-1} \epsilon)^{-1} X^{-1} = (1 - X^{-1} \epsilon) X^{-1}$$

so the derivative at $X$ is the linear map $\epsilon \mapsto -X^{-1} \epsilon X^{-1}$ (you might have guessed that it's $-X^{-2} \epsilon$ but again this only works if $X$ and $\epsilon$ commute).
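
As a sanity check of this formula, here is a small numerical sketch restricted to $2 \times 2$ matrices. The helpers `matmul` and `inv2` and the sample matrices are assumptions made for illustration only; the comparison is against a finite difference, so agreement is expected only up to the step size.

```python
# Numerical check of epsilon -> -X^{-1} epsilon X^{-1} for 2x2 matrices.

def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inv2(m):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical sample data: X is invertible and does not commute with E.
X = [[2.0, 1.0], [1.0, 3.0]]
E = [[0.0, 1.0], [0.0, 0.0]]

X_inv = inv2(X)

# Formula from the text: -X^{-1} E X^{-1}
predicted = [[-v for v in row] for row in matmul(matmul(X_inv, E), X_inv)]

# Finite difference of the inverse in the direction E.
t = 1e-6
X_shifted = [[x + t * e for x, e in zip(row_x, row_e)] for row_x, row_e in zip(X, E)]
finite_diff = [[(s - v) / t for s, v in zip(row_s, row_v)]
               for row_s, row_v in zip(inv2(X_shifted), X_inv)]

# The commutative guess -X^{-2} E, for comparison.
naive = [[-v for v in row] for row in matmul(matmul(X_inv, X_inv), E)]

print("predicted:         ", predicted)
print("finite difference: ", finite_diff)
print("naive -X^-2 E:     ", naive)
```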

Example #2: The derivative of $\det(X)$ is given by writing

$$\det(X + \epsilon) = \det(X) \det(1 + X^{-1} \epsilon) = \det(X) \left( 1 + \text{tr}(X^{-1} \epsilon) \right)$$

so the derivative at an invertible $X$ is the linear map $\epsilon \mapsto \det(X) \text{tr}(X^{-1} \epsilon)$, and in particular when $X = 1$ it's the linear map $\epsilon \mapsto \text{tr}(\epsilon)$.
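
The determinant formula can be checked the same way. This is again only a sketch for the $2 \times 2$ case: `det2`, `trace`, `inv2`, `matmul` and the sample matrices are illustrative assumptions, and $X$ is taken to be invertible, as the derivation requires.

```python
# Numerical check of epsilon -> det(X) tr(X^{-1} epsilon) for 2x2 matrices.

def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det2(m):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c

def inv2(m):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = det2(m)
    return [[d / det, -b / det], [-c / det, a / det]]

def trace(m):
    """Sum of the diagonal entries."""
    return sum(m[i][i] for i in range(len(m)))

# Hypothetical sample data; X must be invertible for the formula to apply.
X = [[2.0, 1.0], [1.0, 3.0]]
E = [[1.0, 2.0], [3.0, 4.0]]

# Formula from the text: det(X) * tr(X^{-1} E)
predicted = det2(X) * trace(matmul(inv2(X), E))

# Finite difference of the determinant in the direction E.
t = 1e-6
X_shifted = [[x + t * e for x, e in zip(row_x, row_e)] for row_x, row_e in zip(X, E)]
finite_diff = (det2(X_shifted) - det2(X)) / t

print("predicted:         ", predicted)
print("finite difference: ", finite_diff)
```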

Example #3: The derivative of $\exp(X)$ is given by writing

$$\exp(X + \epsilon) = \sum_{n=0}^{\infty} \frac{(X + \epsilon)^n}{n!} = \sum_{n=0}^{\infty} \frac{X^n + \sum_{k=1}^n X^{k-1} \epsilon X^{n-k}}{n!}$$

so the derivative at $X$ is the linear map

$$\epsilon \mapsto \sum_{n \ge 0} \frac{\sum_{k=1}^n X^{k-1} \epsilon X^{n-k}}{n!}$$

which I don't believe can be simplified in general, although it ought to have some nice expression in terms of commutators, maybe after multiplying by $\exp(-X)$. If $X$ and $\epsilon$ commute this simplifies to $\exp(X) \epsilon$, and in particular if $X = 0$ this simplifies to $\epsilon$. Formalizing this example would require proving various things about convergence and exchange of limits but it can be done.
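
To spell out the commuting case: if $X \epsilon = \epsilon X$, then every term $X^{k-1} \epsilon X^{n-k}$ equals $X^{n-1} \epsilon$, so the inner sum has $n$ equal terms and

$$\sum_{n \ge 0} \frac{\sum_{k=1}^n X^{k-1} \epsilon X^{n-k}}{n!} = \sum_{n \ge 1} \frac{n X^{n-1} \epsilon}{n!} = \left( \sum_{n \ge 1} \frac{X^{n-1}}{(n-1)!} \right) \epsilon = \exp(X) \epsilon$$

where the $n = 0$ term drops out because its inner sum is empty.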