A proof of: The derivative of the determinant is the trace


I want to solve the following problem:

Show that the derivative of $\mbox{det}:GL(n,\mathbb{R})\rightarrow\mathbb{R}$ at $I\in GL(n,\mathbb{R})$ is given by $$\mbox{det}_{*}(I)(X)=\mbox{tr}X$$

I would like you to check my proof, and answer the question in the end.

My Attempt: I'll denote $N=GL(n,\mathbb{R})$ . Also, $\simeq$ will be used for vector space isomorphisms and $\cong$ will be used for diffeomorphisms.

We know that $\mbox{det}_{*}(I):T_{I}N\rightarrow T_{1}\mathbb{R}$, since $\mbox{det}(I)=1$.

Let $X\in T_{I}N$. We can write $X$ in a basis of $T_{I}N$, so let us find one: we know that $T_{I}N\simeq M_{n}(\mathbb{R})$, so we can get a basis of $T_{I}N$ from a basis of $M_{n}(\mathbb{R})$ using an isomorphism. The function $$f:T_{I}N \rightarrow M_{n}(\mathbb{R}) \\ [\gamma] \mapsto \gamma'(0)$$

is known to be an isomorphism. Furthermore, $\{E_{ij}\}$ is a basis for $M_{n}(\mathbb{R})$, where $E_{ij}$ is the $n\times n$ matrix whose entries are all zero except the $(i,j)$ entry, which is $1$. Thus, a basis for $T_{I}N$ is $\{f^{-1}(E_{ij})\}$. Now, $f^{-1}(E_{ij})$ is the equivalence class of curves $\gamma:\mathbb{R}\rightarrow N$ such that $\gamma(0)=I$ and $\gamma'(0)=E_{ij}$. Hence, a representative of this equivalence class is $\alpha_{ij}(t)=I+tE_{ij}$, and so we can write $f^{-1}(E_{ij})=[\alpha_{ij}]$.

Hence, we can write $X=\overset{n}{\underset{i,j=1}{\sum}}x_{ij}[\alpha_{ij}]$ .

Let us see how $\mbox{det}_{*}(I)$ acts on the basis elements $[\alpha_{ij}]$.

We have $\mbox{det}_{*}(I)([\alpha_{ij}])=[\mbox{det}\circ\alpha_{ij}]_{1}$

by definition of the derivative (the subscript $1$ reminds us that the equivalence relation of this equivalence class is different, since it is defined on the set of all curves $\gamma:\mathbb{R}\rightarrow\mathbb{R}$ such that $\gamma(0)=\mbox{det}(I)=1$).

Now, $\mbox{det}\circ\alpha_{ij}:\mathbb{R}\rightarrow\mathbb{R}$ is given by $$(\mbox{det}\circ\alpha_{ij})(t)=\mbox{det}(\alpha_{ij}(t))=\mbox{det}\left(I+tE_{ij}\right)=\mbox{det}\left(\left[\begin{array}{ccc} 1 & & \mathbb{O}\\ & \ddots\\ \mathbb{O} & & 1 \end{array}\right]+\left[\begin{array}{cccc} \mathbb{O} & & & \mathbb{O}\\ & & t\,(i,j\mbox{ entry})\\ \\ \mathbb{O} & & & \mathbb{O} \end{array}\right]\right).$$ This matrix is triangular when $i\neq j$ (and diagonal when $i=j$), so its determinant is the product of the diagonal entries. Hence, $(\mbox{det}\circ\alpha_{ij})(t)=1+t\delta_{ij}$, with $\delta_{ij}$ the Kronecker delta.
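Since this computation is the heart of the argument, a quick numerical sanity check may be reassuring. The following sketch (my addition, using NumPy; the size $n=4$ and value $t=0.3$ are arbitrary) confirms $\det(I+tE_{ij})=1+t\delta_{ij}$ for all pairs $(i,j)$:

```python
import numpy as np

# Illustrative numerical check (not part of the proof): verify that
# det(I + t*E_ij) = 1 + t*delta_ij for every pair (i, j).
n, t = 4, 0.3
I = np.eye(n)
for i in range(n):
    for j in range(n):
        E = np.zeros((n, n))
        E[i, j] = 1.0  # E_ij: all zeros except a 1 in entry (i, j)
        expected = 1.0 + t * (1.0 if i == j else 0.0)
        assert np.isclose(np.linalg.det(I + t * E), expected)
```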

Hence, $$\mbox{det}_{*}(I)([\alpha_{ij}])=[1+t\delta_{ij}]_{1}\in T_{1}\mathbb{R}$$

Finally, $$\mbox{det}_{*}(I)(X)=\overset{n}{\underset{i,j=1}{\sum}}x_{ij}\mbox{det}_{*}(I)([\alpha_{ij}])=\overset{n}{\underset{i,j=1}{\sum}}x_{ij}[1+t\delta_{ij}]_{1}$$

Now, I noticed that, if I for some reason use the isomorphism $$g:T_{1}\mathbb{R} \rightarrow \mathbb{R} \\ [\gamma]_{1} \mapsto \gamma'(0)$$

to “identify” $[1+t\delta_{ij}]_{1}$ with $g([1+t\delta_{ij}]_{1})=\delta_{ij}$ and use $\delta_{ij}$ in place of $[1+t\delta_{ij}]_{1}$, I get $\overset{n}{\underset{i,j=1}{\sum}}x_{ij}\delta_{ij}=\mbox{tr}X$.
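One way to gain confidence in this identification is to check numerically that the ordinary derivative of $t\mapsto\det(I+tX)$ at $t=0$ is $\operatorname{tr}X$. This is an illustrative sketch (my addition; the random $5\times 5$ matrix is arbitrary), not part of the proof:

```python
import numpy as np

# Estimate d/dt det(I + t*X) at t = 0 by a central difference and
# compare with tr(X) for a random matrix X.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 5))
h = 1e-6
deriv = (np.linalg.det(np.eye(5) + h * X)
         - np.linalg.det(np.eye(5) - h * X)) / (2 * h)
assert np.isclose(deriv, np.trace(X), atol=1e-5)
```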

My question is: why is this last step (since "Now, I noticed...") legitimate?

2 Answers

BEST ANSWER

Since it seems you want to understand all the various identifications involved, let me introduce some notation to try and clarify what is going on.

Let $V$ be a finite dimensional real vector space (endowed with the natural smooth structure) and let $U \subseteq V$ be an open subset. Assume we are given a smooth function $F \colon U \rightarrow \mathbb{R}$. Then we can calculate three a priori distinct things:

  1. We can calculate the standard multivariable directional derivative of $F$ at a point $p \in U$ in the direction $v \in V$. I'll denote it by $$ DF|_{p}(v) := \lim_{t \to 0} \frac{F(p + tv) - F(p)}{t}. $$ Note that $DF|_{p} \colon V \rightarrow \mathbb{R}$.
  2. We can treat $U$ as a smooth manifold and calculate the differential of $F$ at a point $p \in U$. This will give us a map $dF|_{p} \colon T_p U \rightarrow \mathbb{R}$.
  3. We can treat both $U$ and $\mathbb{R}$ as smooth manifolds and calculate the "full differential" of $F$ at a point $p$ which I'll denote by $F_{*}$. This will give us a map $F_{*}|_{p} \colon T_p U \rightarrow T_{F(p)} (\mathbb{R})$.

What is the relation between the three distinct maps? Like you noted, we have natural isomorphisms $f \colon T_pU \rightarrow V$ and $g \colon T_{F(p)}(\mathbb{R}) \rightarrow \mathbb{R}$. In terms of those isomorphisms, we have the relations

$$ dF|_{p} = DF|_{p} \circ f, \,\,\, dF|_{p} = g \circ F_{*}|_{p}, \,\,\, g^{-1} \circ DF|_{p} \circ f = F_{*}|_{p}. $$

Now, in your case, we have $F = \det$ and $p = I$. By the formula you are given, it seems that you are asked to calculate $DF|_{p}$. Instead, you have tried to calculate $F_{*}|_{p}$, hence the need for isomorphisms $f$ and $g$ to relate the formula you are asked to show with your calculation. Note that like Qiaochu said in his comment, it is much easier to calculate $DF|_{p}(X)$ without choosing a basis but your more complicated calculation agrees with the expected result.
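To see the relation $DF|_{p}(X)=\operatorname{tr}X$ concretely for $F=\det$ and $p=I$, here is a small symbolic computation (my addition, using SymPy; the $2\times 2$ case is only an illustration): the coefficient of $t$ in $\det(I+tX)$ is exactly the trace.

```python
import sympy as sp

# Expand det(I + t*X) for a generic 2x2 matrix X and read off the
# coefficient of t, which is the directional derivative of det at I.
t = sp.symbols('t')
a, b, c, d = sp.symbols('a b c d')
X = sp.Matrix([[a, b], [c, d]])
det_curve = sp.expand((sp.eye(2) + t * X).det())
# det(I + t*X) = 1 + t*(a + d) + t**2*(a*d - b*c)
assert det_curve.coeff(t, 1) == a + d  # the trace of X
```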


Another approach uses standard coordinates: note that, as a function of the matrix entries, the determinant is particularly simple, since it is linear in each entry.

If you expand the determinant along a row (the Laplace expansion), you see that if $X=(x_{i,j})$ is a matrix and the index $i$ is fixed, then $$\det X= \sum _{j=1}^n (-1)^{i+j}\, x_{i,j} \det X_{i,j},$$ where $X_{i,j}$ is obtained by erasing the $i$-th row and $j$-th column of $X$.
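This row expansion is easy to check numerically; the following sketch (my addition, with an arbitrary random $4\times4$ matrix and expansion row) is only an illustration:

```python
import numpy as np

# Verify the Laplace expansion along a fixed row i:
# det(X) = sum_j (-1)**(i+j) * x_ij * det(X_ij),
# where X_ij deletes row i and column j of X.
rng = np.random.default_rng(2)
X = rng.standard_normal((4, 4))
i = 1  # any fixed row works
expansion = sum(
    (-1) ** (i + j) * X[i, j]
    * np.linalg.det(np.delete(np.delete(X, i, axis=0), j, axis=1))
    for j in range(4)
)
assert np.isclose(expansion, np.linalg.det(X))
```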

It follows that $\left({ \partial \over \partial x_{i,j}} \det \right)(X)= (-1)^{i+j} \det X_{i,j}$, since $X_{i,j}$ does not involve the entry $x_{i,j}$.

In particular, if $X= \mathrm{Id}$ is the identity matrix, then $\left({ \partial \over \partial x_{i,j}} \det \right)(\mathrm{Id})=0$ if $i\not = j$ (erasing row $i$ and column $j\neq i$ of the identity leaves a matrix with a zero column) and $\left({ \partial \over \partial x_{i,i}} \det \right)(\mathrm{Id})=1$.

Whence $$d \det ({\mathrm{Id}})(M)= \sum _{i,j} \left({ \partial \over \partial x_{i,j}} \det \right)(\mathrm{Id})\, m_{i,j} =\sum _i m_{i,i} = \operatorname{tr} M.$$

This approach immediately gives you the gradient of the determinant at any point $X$: $\vec {\mathrm{grad}} (\det )(X)$ is the matrix whose $(i,j)$ entry is $(-1)^{i+j} \det X_{i,j}$, i.e. the cofactor matrix of $X$.
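As a closing sanity check (my addition; the random matrix and step size are arbitrary), each partial derivative of $\det$, estimated by central differences, matches the corresponding cofactor $(-1)^{i+j}\det X_{i,j}$:

```python
import numpy as np

# Compare finite-difference partial derivatives of det at a point X
# with the cofactors (-1)**(i+j) * det(X_ij).
rng = np.random.default_rng(1)
X = rng.standard_normal((4, 4))
h = 1e-6
for i in range(4):
    for j in range(4):
        E = np.zeros((4, 4))
        E[i, j] = 1.0
        partial = (np.linalg.det(X + h * E)
                   - np.linalg.det(X - h * E)) / (2 * h)
        minor = np.delete(np.delete(X, i, axis=0), j, axis=1)
        cofactor = (-1) ** (i + j) * np.linalg.det(minor)
        assert np.isclose(partial, cofactor, atol=1e-4)
```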