Explicit proof of the derivative of a matrix logarithm

Question

15.3k Views Asked by Bumbble Comm At 29 Mar 2026 - 3:16

Firstly, I'm but a mere physicist, so please be gentle :-) I want to show explicitly that the derivative of the (natural) logarithm of a general (diagonalizable) $n \times n$ matrix $X(x)$ with respect to $x$ is

$$\frac{\text{d}}{\text{d}x}\Big(\ln{\left[X(x)\right]}\Big) = X'(x)X^{-1}$$

where $X'(x)$ is the derivative of $X$ w.r.t. $x$.

I'm going about this in a similar way to how I would prove it for $X$ being just a scalar function of $x$, meaning I start from the definition of the derivative

$$ \newcommand{\D}[2]{\frac{\text{d}#1}{\text{d}#2}} \D{}{x}\Big(\ln{[X(x)]}\Big) = \lim_{\Delta x\rightarrow 0}{\frac{\ln{[X+\Delta X]}-\ln{X}}{\Delta x}} $$

where I rewrite $\Delta X = X'\Delta x$:

$$ \newcommand{\D}[2]{\frac{\text{d}#1}{\text{d}#2}} \D{}{x}\Big(\ln{[X(x)]}\Big) = \lim_{\Delta x\rightarrow 0}{\frac{\ln{[X+X'\Delta x]}-\ln{X}}{\Delta x}} $$

The idea is then to use some logarithm properties to get $e$ out of it$^1$:

$$\newcommand{\D}[2]{\frac{\text{d}#1}{\text{d}#2}} \D{}{x}\Big(\ln{[X(x)]}\Big) = \lim_{\Delta x\rightarrow 0}{\frac{1}{\Delta x}\Big(\ln{[XX^{-1}+X'X^{-1}\Delta x]}\Big)} \\ \D{}{x}\Big(\ln{[X(x)]}\Big) = \lim_{\Delta x\rightarrow 0}{\frac{1}{\Delta x}\Big(\ln{[\mathbb{I}+X'X^{-1}\Delta x]}\Big)} \\ \D{}{x}\Big(\ln{[X(x)]}\Big) = \lim_{\Delta x\rightarrow 0}{\ln{\left[\left(\mathbb{I}+X'X^{-1}\Delta x\right)^{\frac{1}{\Delta x}}\right]}} \\ \D{}{x}\Big(\ln{[X(x)]}\Big) = \lim_{U\rightarrow 0}{\ln{\left[\left(\mathbb{I}+U\right)^{X'X^{-1}U^{-1}}\right]}} \\ \D{}{x}\Big(\ln{[X(x)]}\Big) = X'X^{-1}\lim_{U\rightarrow 0}{\ln{\left[\left(\mathbb{I}+U\right)^{U^{-1}}\right]}} \\ \D{}{x}\Big(\ln{[X(x)]}\Big) = X'X^{-1}\lim_{U\rightarrow 0}{\ln{e}} \\ \D{}{x}\Big(\ln{[X(x)]}\Big) = X'X^{-1} $$

But I'm not at all convinced about all my steps there. Furthermore, I used the logarithm property $\ln{A}-\ln{B} = \ln{AB^{-1}}$ which only holds if $A$ and $B$ commute. I suppose in the limit of $\Delta x$ approaching zero, $\Delta X=X'\Delta x$ and $X^{-1}$ would commute (and $X$ and $X^{-1}$ always do), but I'd like to find out what a mathematician thinks of this.

Lastly I want to add that if I just assume the definition of the matrix logarithm as a power series$^2$,

$$\ln{X} = -\sum_{k=1}^{\infty}{\frac{1}{k}(\mathbb{I}-X)^k},$$

and then differentiate this series, I exactly find $X^{-1}X'$. Again the assumption has to be made, however, that $X$ and $\Delta X$ commute inside a limit.

So my question is: am I right to feel a bit sketchy about my attempt at an explicit proof for the derivative of the matrix logarithm? And can we generally assume $X$ and $\Delta X$ commute when the limit of small $\Delta X$ is to be taken?

If anyone feels particularly inclined, I was also wondering if the power series I've taken as the definition of the matrix logarithm above is indeed the definition and if so, why that one is chosen. Is it purely in analogy to the Taylor expansion of $\ln{x}$? If this would be better asked as a separate question, I'll go ahead and do that.

There are a fair number of related questions on here already, but they haven't allowed me to figure out the answers to my questions in a way that I'm 100% sure I understand.

$^1$ By the way, can anyone tell me why the align environment doesn't work on here? It works just fine for me on Physics.SE.

$^2$ Can anyone confirm that this series converges if $\max_{i}{|1-\lambda_i|} < 1$?

There are 3 best solutions below

**Bumbble Comm** · Answer 1 · 2014-03-23 14:40:40

You can write $d\log X = dX\,X^{-1}$ if and only if $X$ and $dX$ commute. In that case, of course: $$ dX\,X^{-1} = X^{-1}dX. $$

In the general case they do not commute, and there is no simple rule for the derivative of the logarithm. The expressions $dX\,X^{-1} $ and $X^{-1}dX$ are called "logarithmic derivatives" because they share some properties with the actual derivative of the logarithm, but neither of them is that derivative.

The reason behind this is that, for general matrices: $$ e^A\,dA\ne d(e^A) \ne dA\,e^A, $$

unless $A$ and $dA$ commute. This can be seen from the definition by the Taylor series: $$ d(e^A) = d \left( 1 + A + \frac{1}{2}A^2 +\dots \right) = 0 + dA + \frac{1}{2}A\,dA + \frac{1}{2}dA\,A +... $$

which is not equal to: $$ dA + dA\,A +...= dA (1+A+...) = dA\,e^A, $$

because $\frac{1}{2}(dA\,A+A\,dA)\ne dA\,A$ in general.

You might feel that if $dA$ is "small", then the commutator is "small" and can be dropped. That is a dangerous assumption: the commutator of $A$ and $dA$ is of the same order as $dA$ itself, so it matters.
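The point about the commutator can be checked numerically. The following is a minimal, dependency-free sketch: the matrices `A` and `dA`, the step size `h`, and the helper names are all illustrative assumptions, not part of the argument above. It compares a central finite difference of $e^{A}$ in the direction $dA$ with the two candidate expressions $dA\,e^A$ and $e^A\,dA$.

```python
# Plain nested lists stand in for 2x2 matrices; no external libraries needed.
def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(P, Q, scale=1.0):
    """Return P + scale * Q."""
    return [[P[i][j] + scale * Q[i][j] for j in range(2)] for i in range(2)]

def expm(A, terms=30):
    """Matrix exponential from its Taylor series, truncated after `terms` terms."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, A)]  # A^n / n!
        result = add(result, term)
    return result

def max_abs_diff(P, Q):
    return max(abs(P[i][j] - Q[i][j]) for i in range(2) for j in range(2))

A = [[0.0, 1.0], [0.0, 0.0]]    # illustrative choice
dA = [[0.0, 0.0], [1.0, 0.0]]   # perturbation direction; does not commute with A

h = 1e-5
# Central finite difference approximating d(e^A) in the direction dA.
plus = expm(add(A, dA, h))
minus = expm(add(A, dA, -h))
d_exp = [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)]
         for i in range(2)]

left = matmul(dA, expm(A))    # dA e^A
right = matmul(expm(A), dA)   # e^A dA

print("finite difference:", d_exp)
print("dA e^A:           ", left)
print("e^A dA:           ", right)
print("gap to dA e^A:", max_abs_diff(d_exp, left))
print("gap to e^A dA:", max_abs_diff(d_exp, right))
```

For this particular pair the finite difference is close to $\begin{pmatrix} 1/2 & 1/6 \\ 1 & 1/2 \end{pmatrix}$, which matches neither product, and the gaps stay of order one however small `h` is made.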

**Bumbble Comm** · Answer 2 · 2017-01-06 01:43:17

A simple expression can be derived by manipulating the Taylor series $\ln X = \sum_{n=1}^\infty -\frac{(-1)^n}{n}(X-1)^n$, with the result $$\frac{d}{ds}\ln X(s) = \int_0^1 \frac{1}{1-t\,(1-X(s))} X'(s) \frac{1}{1-t\,(1-X(s))}\, dt\ .$$ Although this is not a closed form, it is easy to evaluate numerically. In the above expressions, $1$ is the unit matrix and $\frac{1}{M}$ denotes the matrix inverse $M^{-1}$.

To derive: $$\frac{d}{ds}\ln X(s) = -\sum_{n=1}^\infty \frac{(-1)^n}{n}\sum_{a=0}^{n-1}(X-1)^a X' (X-1)^{n-1-a}\\ =-\sum_{a=0}^\infty \sum_{n=a+1}^\infty \frac{(-1)^n}{n}(X-1)^a X' (X-1)^{n-1-a}\\ = -\sum_{a=0}^\infty\sum_{b=0}^\infty\frac{(-1)^{a+b+1}}{a+b+1}(X-1)^a X' (X-1)^{b}\\ = \sum_{a=0}^\infty\sum_{b=0}^\infty \int_0^1 dt\, t^{a+b}(1-X)^a X' (1-X)^{b}\ . $$ On performing the sums over $a$ and $b$ one gets the formula stated above.
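As a rough numerical check of the integral formula, here is a dependency-free sketch. The example curve `X_of`, the quadrature choice (composite Simpson's rule), and all helper names are assumptions made for illustration. The logarithm is evaluated with the power series from the question, so the example matrix is deliberately kept close to the identity.

```python
I2 = [[1.0, 0.0], [0.0, 1.0]]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lincomb(a, P, b, Q):
    """Return a*P + b*Q for 2x2 matrices."""
    return [[a * P[i][j] + b * Q[i][j] for j in range(2)] for i in range(2)]

def inv2(P):
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    return [[P[1][1] / det, -P[0][1] / det], [-P[1][0] / det, P[0][0] / det]]

def max_abs_diff(P, Q):
    return max(abs(P[i][j] - Q[i][j]) for i in range(2) for j in range(2))

def logm_series(X, terms=200):
    """ln X = -sum_{k>=1} (1 - X)^k / k; only usable where the series converges."""
    E = lincomb(1.0, I2, -1.0, X)
    power, total = I2, [[0.0, 0.0], [0.0, 0.0]]
    for k in range(1, terms + 1):
        power = matmul(power, E)
        total = lincomb(1.0, total, -1.0 / k, power)
    return total

def X_of(s):   # hypothetical example curve; X and X' do not commute
    return [[1.0 + 0.2 * s, 0.3], [0.1 * s, 1.0 - 0.1 * s]]

def dX_of(s):  # its entrywise derivative with respect to s
    return [[0.2, 0.0], [0.1, -0.1]]

def integrand(t, X, dX):
    # [1 - t(1 - X)]^{-1} X' [1 - t(1 - X)]^{-1}, using 1 - t(1 - X) = (1 - t) 1 + t X
    G = inv2(lincomb(1.0 - t, I2, t, X))
    return matmul(matmul(G, dX), G)

def dlog_integral(X, dX, n=200):
    """Composite Simpson's rule on [0, 1] with n (even) subintervals."""
    step = 1.0 / n
    total = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(n + 1):
        w = 1.0 if i in (0, n) else (4.0 if i % 2 else 2.0)
        total = lincomb(1.0, total, w * step / 3.0, integrand(i * step, X, dX))
    return total

s, h = 1.0, 1e-5
# Central finite difference of ln X(s) as an independent reference value.
fd = lincomb(0.5 / h, logm_series(X_of(s + h)), -0.5 / h, logm_series(X_of(s - h)))
formula = dlog_integral(X_of(s), dX_of(s))
naive = matmul(dX_of(s), inv2(X_of(s)))   # X' X^{-1}

print("integral formula vs finite difference:", max_abs_diff(formula, fd))
print("X' X^{-1}        vs finite difference:", max_abs_diff(naive, fd))
```

With these choices the integral formula should agree with the finite difference to within the discretisation error, while $X'X^{-1}$ should be off by an amount comparable to the commutator of $X$ and $X'$.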

**Bumbble Comm** · Answer 3 · 2023-09-10 17:44:06

Since $X$ is diagonalizable
$$\eqalign{ \def\b{\beta} \def\M{M^{-1}} \def\BR#1{\Big[#1\Big]} \def\LR#1{\left(#1\right)} \def\op#1{\operatorname{#1}} \def\diag#1{\op{diag}\LR{#1}} \def\Diag#1{\op{Diag}\LR{#1}} \def\qiq{\quad\implies\quad} X &= MB\M \qiq B = \Diag{\b_k} \\ }$$
you can use the $\sf Daleckii$-$\sf Krein$ Theorem
$$\eqalign{ F &= f(X) \\ dF &= M\,\BR{R\odot\LR{\M\,dX\:M}}\,\M \\ R_{jk} &= \begin{cases} {\large\frac{f(\b_j)\,-\,f(\b_k)}{\b_j\,-\,\b_k}}\qquad{\rm if}\;\b_j\ne\b_k \\ \\ \quad{\small f'(\b_k)}\qquad\qquad{\rm otherwise} \\ \end{cases} \\ }$$
where the derivatives with respect to $x$ have been abbreviated as
$$\eqalign{ dF \equiv \frac{dF}{dx}\qquad\quad dX \equiv \frac{dX}{dx} }$$
$\odot$ denotes the Hadamard product, and $f(x)$ can be any function such that $f(\b_k)\,$ and $f'(\b_k)$ are defined.

In this particular case, $f(x)=\log(x)$ and $f'(x)=x^{-1}\,$ so none of the eigenvalues can be zero (i.e. the matrix must be invertible).
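As a consistency check, here is a short sketch of how the theorem reduces to the familiar scalar-like rule in the commuting case. It assumes, for illustration only, that $M^{-1}\,dX\,M = D$ is diagonal, which in particular makes $X$ and $dX$ commute. The Hadamard product then only sees the diagonal entries $R_{kk} = f'(\beta_k) = \beta_k^{-1}$, so

$$
R\odot D = B^{-1}D
\quad\implies\quad
dF = M\,B^{-1}D\,M^{-1}
   = \left(M B^{-1} M^{-1}\right)\left(M D M^{-1}\right)
   = X^{-1}\,dX = dX\,X^{-1}.
$$

When $M^{-1}\,dX\,M$ has off-diagonal entries, those entries are weighted by the divided differences $\frac{\log\beta_j-\log\beta_k}{\beta_j-\beta_k}$ instead, and the result is in general neither $X^{-1}\,dX$ nor $dX\,X^{-1}$.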
