Firstly, I'm but a mere physicist, so please be gentle :-) I want to show explicitly that the derivative of the (natural) logarithm of a general $n \times n$ (diagonalizable) matrix $X(x)$ w.r.t. $x$ is
$$\frac{\text{d}}{\text{d}x}\Big(\ln{\left[X(x)\right]}\Big) = X'(x)X^{-1}$$
where $X'(x)$ is the derivative of $X$ w.r.t. $x$.
I'm going about this in a similar way to how I would prove it for $X$ being just a scalar function of $x$, meaning I start from the definition of the derivative
$$ \newcommand{\D}[2]{\frac{\text{d}#1}{\text{d}#2}} \D{}{x}\Big(\ln{[X(x)]}\Big) = \lim_{\Delta x\rightarrow 0}{\frac{\ln{[X+\Delta X]}-\ln{X}}{\Delta x}} $$
where I rewrite $\Delta X = X'\Delta x$:
$$ \newcommand{\D}[2]{\frac{\text{d}#1}{\text{d}#2}} \D{}{x}\Big(\ln{[X(x)]}\Big) = \lim_{\Delta x\rightarrow 0}{\frac{\ln{[X+X'\Delta x]}-\ln{X}}{\Delta x}} $$
The idea is then to use some logarithm properties to get $e$ out of it$^1$:
$$
\newcommand{\D}[2]{\frac{\text{d}#1}{\text{d}#2}}
\begin{aligned}
\D{}{x}\Big(\ln{[X(x)]}\Big) &= \lim_{\Delta x\rightarrow 0}{\frac{1}{\Delta x}\Big(\ln{[XX^{-1}+X'X^{-1}\Delta x]}\Big)} \\
&= \lim_{\Delta x\rightarrow 0}{\frac{1}{\Delta x}\Big(\ln{[\mathbb{I}+X'X^{-1}\Delta x]}\Big)} \\
&= \lim_{\Delta x\rightarrow 0}{\ln{\left[\left(\mathbb{I}+X'X^{-1}\Delta x\right)^{\frac{1}{\Delta x}}\right]}} \\
&= \lim_{U\rightarrow 0}{\ln{\left[\left(\mathbb{I}+U\right)^{X'X^{-1}U^{-1}}\right]}} \\
&= X'X^{-1}\lim_{U\rightarrow 0}{\ln{\left[\left(\mathbb{I}+U\right)^{U^{-1}}\right]}} \\
&= X'X^{-1}\lim_{U\rightarrow 0}{\ln{e}} \\
&= X'X^{-1}
\end{aligned}
$$
But I'm not at all convinced about all my steps there. Furthermore, I used the logarithm property $\ln{A}-\ln{B} = \ln{AB^{-1}}$ which only holds if $A$ and $B$ commute. I suppose in the limit of $\Delta x$ approaching zero, $\Delta X=X'\Delta x$ and $X^{-1}$ would commute (and $X$ and $X^{-1}$ always do), but I'd like to find out what a mathematician thinks of this.
Lastly I want to add that if I just assume the definition of the matrix logarithm as a power series$^2$,
$$\ln{X} = -\sum_{k=1}^{\infty}{\frac{1}{k}(\mathbb{I}-X)^k},$$
and then differentiate this series term by term, I find exactly $X^{-1}X'$. Again, however, the assumption has to be made that $X$ and $\Delta X$ commute inside the limit.
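To see exactly where commutativity enters, it may help to spell out the termwise derivative. This is only a sketch: it assumes the series may be differentiated term by term, and the index $j$ is introduced here purely for illustration. Applying the product rule to each power without reordering any factors gives
$$ \frac{\text{d}}{\text{d}x}(\mathbb{I}-X)^k = -\sum_{j=0}^{k-1}(\mathbb{I}-X)^{j}\,X'\,(\mathbb{I}-X)^{k-1-j}, $$
so that
$$ \frac{\text{d}}{\text{d}x}\ln{X} = \sum_{k=1}^{\infty}\frac{1}{k}\sum_{j=0}^{k-1}(\mathbb{I}-X)^{j}\,X'\,(\mathbb{I}-X)^{k-1-j}. $$
Only if $X'$ commutes with $X$ can $X'$ be moved to the right of every term. In that case the inner sum collapses to $k(\mathbb{I}-X)^{k-1}X'$, and the geometric series $\sum_{k\ge 1}(\mathbb{I}-X)^{k-1} = X^{-1}$ yields $X^{-1}X'$.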
So my question is: am I right to feel a bit sketchy about my attempt at an explicit proof for the derivative of the matrix logarithm? And can we generally assume $X$ and $\Delta X$ commute when the limit of small $\Delta X$ is to be taken?
If anyone feels particularly inclined, I was also wondering if the power series I've taken as the definition of the matrix logarithm above is indeed the definition and if so, why that one is chosen. Is it purely in analogy to the Taylor expansion of $\ln{x}$? If this would be better asked as a separate question, I'll go ahead and do that.
There's a fair amount of related questions on here already, but they haven't allowed me to figure out the answers to my questions in a way that I'm 100% sure I understand.
$^1$ By the way, can anyone tell me why the align environment doesn't work on here? It works just fine for me on Physics.SE.
$^2$ Can anyone confirm that this series converges if $\max_{i}{|1-\lambda_i|} < 1$, where the $\lambda_i$ are the eigenvalues of $X$?
You can write $d\log X = dX\,X^{-1}$ if and only if $X$ and $dX$ commute. In that case, of course: $$ dX\,X^{-1} = X^{-1}dX. $$
In the general case they do not commute, and there is no simple rule for the derivative of the logarithm. The expressions $dX\,X^{-1}$ and $X^{-1}dX$ are called "logarithmic derivatives" because they share some properties with the actual derivative of the logarithm, but in general they are not equal to it.
The reason behind this is that, for general matrices: $$ e^A\,dA\ne d(e^A) \ne dA\,e^A, $$
unless $A$ and $dA$ commute. This can be seen from the definition by the Taylor series: $$ d(e^A) = d \left( 1 + A + \frac{1}{2}A^2 + \dots \right) = 0 + dA + \frac{1}{2}A\,dA + \frac{1}{2}dA\,A + \dots $$
which is not equal to: $$ dA + dA\,A + \dots = dA\,(1 + A + \dots) = dA\,e^A, $$
because $\frac{1}{2}(dA\,A+A\,dA)\ne dA\,A$ in general.
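A quick numerical check can make this concrete. The following is a minimal sketch in plain Python, not a general-purpose implementation. The matrices `A` and `B` (with $dA = h\,B$), the step `h`, and the helpers `matmul`, `add`, `expm` and `max_abs` are all choices made for this illustration, and `expm` is a truncated Taylor series that is adequate only for small matrices of modest norm. It compares a central finite difference of $e^{A}$ with the two candidate expressions $dA\,e^A$ and $e^A\,dA$.

```python
# Illustrative check that d(e^A) differs from dA e^A and e^A dA when A and dA
# do not commute. All names and matrices are assumptions made for this sketch.

def matmul(P, Q):
    """Product of two square matrices stored as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(P, Q, c=1.0):
    """Return P + c*Q entrywise."""
    return [[p + c * q for p, q in zip(row_p, row_q)]
            for row_p, row_q in zip(P, Q)]

def expm(M, terms=30):
    """Matrix exponential from a truncated Taylor series (small norms only)."""
    n = len(M)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, M)]  # M^k / k!
        result = add(result, term)
    return result

def max_abs(M):
    """Largest absolute entry, used as a simple matrix norm."""
    return max(abs(x) for row in M for x in row)

A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]   # direction of the perturbation, dA = h*B
h = 1e-5

# Central finite difference approximating d(e^A)/h in the direction B.
diff = add(expm(add(A, B, h)), expm(add(A, B, -h)), -1.0)
derivative = [[x / (2 * h) for x in row] for row in diff]

right = matmul(B, expm(A))     # candidate "dA e^A" (per unit h)
left = matmul(expm(A), B)      # candidate "e^A dA" (per unit h)
gap_right = max_abs(add(derivative, right, -1.0))
gap_left = max_abs(add(derivative, left, -1.0))

print("finite difference:", derivative)
print("gap to dA e^A:", gap_right)
print("gap to e^A dA:", gap_left)
```

For this particular pair, $A^2 = 0$, so the Taylor series of the derivative terminates and can be summed by hand to $\begin{pmatrix} 1/2 & 1/6 \\ 1 & 1/2 \end{pmatrix}$. The finite difference reproduces that matrix, and both gaps come out close to $0.5$, not close to zero.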
You might feel that if $dA$ is "small", then the commutator is "small" and can be neglected. That is a dangerous assumption: the commutator is of the same order as $dA$, so it matters.
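The same point can be checked numerically for the logarithm itself. The sketch below is only an illustration under stated assumptions: the matrices `X0` and `dX`, the step sizes, and the helpers `matmul`, `add`, `scale`, `logm_series` and `max_abs` are hypothetical choices, and `logm_series` simply sums the power series for $\ln X$, which is reliable here only because the matrices stay close to the identity. It prints, for several step sizes $h$, the ratio of the commutator $[X, h\,dX]$ to $h\,dX$, and the gap between the difference quotient of $\ln X$ and the naive guess $dX\,X^{-1}$.

```python
# Illustrative check that the commutator [X, dX] is of the same order as dX,
# and that the difference quotient of ln X does not approach dX X^{-1}.
# All names and matrices are assumptions made for this sketch.

def matmul(P, Q):
    """Product of two square matrices stored as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(P, Q, c=1.0):
    """Return P + c*Q entrywise."""
    return [[p + c * q for p, q in zip(row_p, row_q)]
            for row_p, row_q in zip(P, Q)]

def scale(P, c):
    """Return c*P entrywise."""
    return [[c * p for p in row] for row in P]

def logm_series(X, terms=60):
    """ln X = -sum_k (I - X)^k / k, truncated; valid only for X near I."""
    n = len(X)
    identity = [[float(i == j) for j in range(n)] for i in range(n)]
    M = add(identity, X, -1.0)          # M = I - X
    power = identity
    result = [[0.0] * n for _ in range(n)]
    for k in range(1, terms + 1):
        power = matmul(power, M)        # M^k
        result = add(result, power, -1.0 / k)
    return result

def max_abs(M):
    """Largest absolute entry, used as a simple matrix norm."""
    return max(abs(x) for row in M for x in row)

X0 = [[1.0, 0.5], [0.0, 1.0]]
X0_inv = [[1.0, -0.5], [0.0, 1.0]]      # inverse of X0, written out by hand
dX = [[0.0, 0.0], [0.5, 0.0]]           # plays the role of X'
naive = matmul(dX, X0_inv)              # the guess X' X^{-1}

ratios, gaps = [], []
for h in (1e-2, 1e-3, 1e-4):
    step = scale(dX, h)
    commutator = add(matmul(X0, step), matmul(step, X0), -1.0)
    ratios.append(max_abs(commutator) / max_abs(step))
    quotient = scale(add(logm_series(add(X0, step)), logm_series(X0), -1.0), 1.0 / h)
    gaps.append(max_abs(add(quotient, naive, -1.0)))

print("commutator / step:", ratios)
print("gap to X' X^{-1}:", gaps)
```

For this choice the ratio stays at $0.5$ for every $h$, and the gap settles near $0.125$ instead of shrinking, which is what one would expect if the commutator contributes at first order in $dX$.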