Differential and derivative of the trace of a matrix


If $X$ is a square matrix, obtain the differential and the derivative of the functions:

  1. $f(X) = \operatorname{tr}(X)$,
  2. $f(X) = \operatorname{tr}(X^2)$,
  3. $f(X) = \operatorname{tr}(X^p)$ ($p$ is a natural number).

To find the differential I thought I could just find the differential of the composition first and then take the trace of that differential. Am I right in saying so? As for the derivative, I have no idea how I should do it for traces. Could anyone please help me out?

wj32's answer makes sense to me. However, I wonder whether this question can also be solved in the ordinary way of finding differentials and derivatives, namely by expanding $f(X+dX)-f(X)$. Could someone show me how this would be done (if possible)?


There are 2 best solutions below

0
On BEST ANSWER

1) The trace is linear and bounded (boundedness is automatic in finite dimension), so its derivative at every point is the trace itself: $$ df_X(H)=\operatorname{tr}(H). $$

2) This is the composition $f=\operatorname{tr}\circ g\circ h$ of the trace with the bounded bilinear map $g:(X,Y)\longmapsto XY$, whose derivative is $$ dg_{(X,Y)}(H,K)=g(X,K)+g(H,Y)=XK+HY, $$ and the bounded linear map $h:X\longmapsto (X,X)$, whose derivative is itself at every point.

So by the chain rule $$ df_X(H)=d\operatorname{tr}_{X^2}\circ dg_{(X,X)}\circ dh_{X}(H)=\operatorname{tr}(XH+HX). $$ Since $\operatorname{tr}(HX)=\operatorname{tr}(XH)$, this yields $$ df_X(H)=2\operatorname{tr}(XH). $$
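As a sanity check, and as one way to see the direct $f(X+H)-f(X)$ approach in action, here is a small sketch in plain Python. Expanding the square gives $f(X+H)-f(X)=\operatorname{tr}(XH+HX)+\operatorname{tr}(H^2)$, so the part linear in $H$ is the differential and $\operatorname{tr}(H^2)$ is the second-order remainder. The helper names (`matmul`, `trace`, `add`) and the test matrices are illustrative choices of mine, not anything from the question.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    """Sum of the diagonal entries."""
    return sum(A[i][i] for i in range(len(A)))

def add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

# Arbitrary integer test matrices (illustrative values only).
X = [[1, 2], [3, 4]]
H = [[0, 1], [5, -2]]

XH = matmul(X, H)
HX = matmul(H, X)

# f(X) = tr(X^2): split the exact increment into linear part and remainder.
X_plus_H = add(X, H)
increment = trace(matmul(X_plus_H, X_plus_H)) - trace(matmul(X, X))
linear_part = 2 * trace(XH)        # the claimed differential df_X(H)
remainder = trace(matmul(H, H))    # second order in H

print(XH != HX, trace(XH) == trace(HX))   # True True
print(increment, linear_part, remainder)  # 24 10 14
```

Integer entries keep the arithmetic exact, so the identity can be checked with `==` rather than a tolerance.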

3) This is a composition again, of the trace with the $p$-linear map $$ k:(X_1,\ldots,X_p)\longmapsto X_1\cdots X_p $$ and the linear map $l:X\longmapsto (X,\ldots,X)$.

The derivative of $k$ is $$ dk_{(X_1,\ldots,X_p)}(H_1,\ldots,H_p)=H_1X_2\cdots X_p+X_1H_2X_3\cdots X_p+\ldots+X_1\cdots X_{p-1}H_p. $$

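And the derivative of $l$ is itself. So by the chain rule $$ df_X(H)=d\operatorname{tr}_{X^p}\circ dk_{(X,\ldots,X)}\circ dl_X(H)=\operatorname{tr}\left(\sum_{j=0}^{p-1}X^{j}HX^{p-1-j}\right). $$ Since $\operatorname{tr}(AB)=\operatorname{tr}(BA)$, each of the $p$ terms has trace $\operatorname{tr}(X^{p-1}H)$, and we find $$ df_X(H)=p\operatorname{tr} (X^{p-1}H). $$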
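The formula $p\operatorname{tr}(X^{p-1}H)$ can be checked numerically against a central difference quotient. The sketch below also reads off a gradient matrix $G=p\,(X^{p-1})^T$ satisfying $df_X(H)=\sum_{i,j}G_{ij}H_{ij}$. That step assumes the convention $G_{ij}=\partial f/\partial X_{ij}$, which some texts transpose. All function names, the step size `t`, and the test matrices are my own illustrative choices.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def matpow(A, p):
    """A^p for an integer p >= 0, by repeated multiplication."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(p):
        result = matmul(result, A)
    return result

def f(X, p):
    """f(X) = tr(X^p)."""
    return trace(matpow(X, p))

def differential(X, H, p):
    """Claimed differential: df_X(H) = p * tr(X^(p-1) H)."""
    return p * trace(matmul(matpow(X, p - 1), H))

def gradient(X, p):
    """p * (X^(p-1))^T, assuming the layout G[i][j] = d f / d X[i][j]."""
    P = matpow(X, p - 1)
    n = len(X)
    return [[p * P[j][i] for j in range(n)] for i in range(n)]

def central_difference(X, H, p, t=1e-5):
    """(f(X + tH) - f(X - tH)) / (2t), an approximation of df_X(H)."""
    n = len(X)
    plus = [[X[i][j] + t * H[i][j] for j in range(n)] for i in range(n)]
    minus = [[X[i][j] - t * H[i][j] for j in range(n)] for i in range(n)]
    return (f(plus, p) - f(minus, p)) / (2 * t)

# Arbitrary test data (illustrative values only).
X = [[0.5, 1.0], [1.5, 2.0]]
H = [[0.0, 1.0], [5.0, -2.0]]

for p in range(1, 6):
    G = gradient(X, p)
    paired = sum(G[i][j] * H[i][j] for i in range(2) for j in range(2))
    print(p, differential(X, H, p), central_difference(X, H, p), paired)
```

For each $p$ the three printed numbers should agree up to finite-difference and rounding error.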

3
On

$\newcommand{\tr}{\operatorname{tr}}$I'm not familiar with matrix-ey notation, so I'll just write down what I know.

Let $V$ be a finite-dimensional real vector space and let $L(V)$ be the space of linear operators on $V$ (all of which are continuous, since $V$ is finite-dimensional). The trace is a linear map $\tr:L(V)\to\mathbb{R}$, so $D\tr(x)=\tr$ for every $x\in L(V)$. That answers your first question.

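For the third question, we just use the chain rule together with the power rule stated in the lemma below; the second question is the case $n=2$. Define $p_n:L(V)\to L(V)$ by $p_n(x)=x^n$. You want \begin{align} D(\tr\circ p_n)(x)u &= (D\tr(x^n)\circ Dp_n(x))u \\ &= \tr\left(\sum_{k=0}^{n-1} x^kux^{n-k-1}\right) \\ &= \sum_{k=0}^{n-1}\tr\left(x^kux^{n-k-1}\right) \\ &= n\tr(x^{n-1}u), \end{align} where the last step uses $\tr(ab)=\tr(ba)$ to rewrite each summand as $\tr(x^{n-1}u)$.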


Lemma (Power rule). Let $E$ be a Banach algebra and let $p_n:E \to E$ be the map defined by $p_n(x)=x^n$. Then $$Dp_n(x)u=\sum_{k=0}^{n-1} x^kux^{n-k-1}.$$ In particular, if $E$ is commutative then $$Dp_n(x)u=nx^{n-1}u,$$ which is just the plain old power rule.

Proof. We use induction on $n$. The case $n=0$ is clear ($p_0$ is constant, so its derivative is zero, and the sum is empty), so suppose the result holds for $n-1$. Since $p_n(x)=xp_{n-1}(x)$, the product rule shows that \begin{align} Dp_n(x)u &= up_{n-1}(x)+xDp_{n-1}(x)u \\ &= ux^{n-1}+x\sum_{k=0}^{n-2}x^kux^{n-k-2} \\ &= ux^{n-1}+\sum_{k=1}^{n-1}x^kux^{n-k-1} \\ &= \sum_{k=0}^{n-1}x^kux^{n-k-1}. \end{align}
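To see the induction step concretely, here is a small sketch in plain Python that builds $Dp_n(x)u$ two ways: from the closed-form sum in the lemma, and from the recursion $Dp_n(x)u=ux^{n-1}+x\,Dp_{n-1}(x)u$ used in the proof. It takes $E$ to be the $2\times 2$ integer matrices so that the comparison is exact. The function names and test matrices are my own illustrative choices.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def zeros(n):
    return [[0] * n for _ in range(n)]

def matpow(A, p):
    """A^p for an integer p >= 0, with A^0 the identity."""
    n = len(A)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(p):
        result = matmul(result, A)
    return result

def power_rule_sum(x, u, n):
    """Closed form from the lemma: sum over k of x^k u x^(n-k-1)."""
    total = zeros(len(x))
    for k in range(n):
        term = matmul(matmul(matpow(x, k), u), matpow(x, n - k - 1))
        total = add(total, term)
    return total

def power_rule_recursive(x, u, n):
    """Recursion from the proof: D p_n = u x^(n-1) + x D p_(n-1), D p_0 = 0."""
    if n == 0:
        return zeros(len(x))
    previous = power_rule_recursive(x, u, n - 1)
    return add(matmul(u, matpow(x, n - 1)), matmul(x, previous))

# Arbitrary non-commuting integer matrices (illustrative values only).
x = [[1, 2], [3, 4]]
u = [[0, 1], [5, -2]]

for n in range(6):
    print(n, power_rule_sum(x, u, n) == power_rule_recursive(x, u, n))
```

Taking the trace of either result should give $n\tr(x^{n-1}u)$ for $n\ge 1$, which is the formula used for the third question.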