Confused about chain rule in Frechet derivative

I've recently started to learn about Frechet derivatives and now have a simple example which I'm not sure if I've solved correctly. To be honest, I've only got a poor understanding of how it works so please assume I am starting at zero. Given $\delta = \sum_i \lambda_iX_i$ for some matrices $X_i$ and real numbers $\lambda_i$ and $N(X)$ being a linear transformation on $X$, find the Frechet derivative

$$\frac{\partial}{\partial \lambda_l}Tr(A\log(N(\delta))),$$

for some $l$.


Redoing my attempted solution after reading greg's answer. Would appreciate if anyone can comment on the correctness of it!

We have \begin{align} d\,Tr(A\log(N(\delta))) &= Tr(A\,d\log(N(\delta))) \\ &= Tr\left(\left(\int_0^{\infty}\frac{1}{1+tN(\delta)}A\frac{1}{1+tN(\delta)}\,\mathrm{d}t\right)dN(\delta)\right) \\ &= Tr\left(\left(\int_0^{\infty}\frac{1}{1+tN(\delta)}A\frac{1}{1+tN(\delta)}\,\mathrm{d}t\right)N(d\delta)\right) \\ &= Tr\left(\left(\int_0^{\infty}\frac{1}{1+tN(\delta)}A\frac{1}{1+tN(\delta)}\,\mathrm{d}t\right)N(X_k d\lambda_k)\right) \end{align}

I have used the fact that $Tr$ and $N()$ are both linear operators, together with the solution posted here for the differential $d(A\log X)$. I believe that this cannot proceed further unless I can say something about $N()$ that lets me rewrite $N(X_k d\lambda_k)$ as $M(X_k) d\lambda_k$. If I could do that step, I would have

\begin{align} d\,Tr(A\log(N(\delta))) &= Tr\left(\left(\int_0^{\infty}\frac{1}{1+tN(\delta)}A\frac{1}{1+tN(\delta)}\,\mathrm{d}t\right)M(X_k) d\lambda_k\right)\\ \frac{\partial}{\partial\lambda_k}Tr(A\log(N(\delta))) &= \left(\left(\int_0^{\infty}\frac{1}{1+tN(\delta)}A\frac{1}{1+tN(\delta)}\,\mathrm{d}t\right)M(X_k)\right)^T \end{align}

On BEST ANSWER

Define the matrix (called $\delta$ in the question) $$\eqalign{ Y &= \sum_{k=1}^N \lambda_kX_k = \lambda_kX_k \cr }$$ where the expression on the far RHS uses the index summation convention.

Now calculate the differential and derivative for the trace of a simple nonlinear function. $$\eqalign{ \phi &= {\rm Tr}\Big(Y^3\Big) \cr d\phi &= {\rm Tr}\Big(3Y^2\,dY\Big) = {\rm Tr}\Big(3Y^2X_k\,d\lambda_k\Big) \cr \frac{d\phi}{d\lambda_k} &= {\rm Tr}\Big(3Y^2X_k\Big) \cr }$$ A general function $f(Y)$ follows the same pattern. $$\eqalign{ \psi &= {\rm Tr}\Big(f(Y)\Big) \cr \frac{d\psi}{d\lambda_k} &= {\rm Tr}\Big(f'(Y)\,X_k\Big) \cr }$$ where $f'$ denotes the ordinary derivative of the function.
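
The pattern above can be sanity-checked numerically with finite differences. The following is only a sketch and is not part of the original answer. It assumes symmetric positive definite $X_k$ and positive $\lambda_k$ (so that the logarithm is real), and a hypothetical linear map $N(X) = KXK^T$ with a fixed invertible $K$; the names `K`, `sym_logm` and `dlog_adjoint` are invented for illustration. It checks $\frac{d}{d\lambda_k}{\rm Tr}(Y^3) = {\rm Tr}(3Y^2X_k)$ and, for the function in the question, $$\frac{\partial}{\partial\lambda_k}{\rm Tr}\Big(A\log N(Y)\Big) = {\rm Tr}\Big(G\,N(X_k)\Big), \qquad G = \int_0^{\infty}\frac{1}{1+tN(Y)}\,A\,\frac{1}{1+tN(Y)}\,dt,$$ where linearity of $N$ gives $N(X_k\,d\lambda_k) = N(X_k)\,d\lambda_k$ and the result is a scalar trace. Note that $A$ does not commute with $N(Y)$ in general, so the shortcut ${\rm Tr}(f'(Y)X_k)$ does not apply directly and the integral factor $G$ is needed. In the code, $G$ is evaluated in the eigenbasis of $N(Y)$, using $\int_0^\infty \frac{dt}{(1+ta)(1+tb)} = \frac{\log a - \log b}{a - b}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3  # matrix size and number of terms in the sum (arbitrary choices)

def random_spd(size):
    """Random symmetric positive definite matrix (assumption for this sketch)."""
    B = rng.standard_normal((size, size))
    return B @ B.T + size * np.eye(size)

X = [random_spd(n) for _ in range(m)]            # the matrices X_k
lam = rng.uniform(0.5, 1.5, size=m)              # the coefficients lambda_k
A = rng.standard_normal((n, n))                  # fixed matrix A
K = rng.standard_normal((n, n)) + n * np.eye(n)  # hypothetical: defines N below

def Y_of(lam):
    """Y = sum_k lambda_k X_k."""
    return sum(l * Xk for l, Xk in zip(lam, X))

def N(M):
    """Hypothetical linear map N(M) = K M K^T; keeps SPD matrices SPD."""
    return K @ M @ K.T

def sym_logm(S):
    """Matrix logarithm of a symmetric positive definite matrix."""
    w, U = np.linalg.eigh(S)
    return (U * np.log(w)) @ U.T

def dlog_adjoint(S, A):
    """Return G such that Tr(A dlog(S)[H]) = Tr(G H) for SPD S.

    Assumes the eigenvalues of S are distinct (true for generic random input).
    """
    w, U = np.linalg.eigh(S)
    lw = np.log(w)
    with np.errstate(divide="ignore", invalid="ignore"):
        F = (lw[:, None] - lw[None, :]) / (w[:, None] - w[None, :])
    np.fill_diagonal(F, 1.0 / w)  # limit of the divided difference: d/dw log(w)
    return U @ ((U.T @ A @ U) * F) @ U.T

def phi(lam):
    """Tr(Y^3), the example from the answer."""
    return np.trace(np.linalg.matrix_power(Y_of(lam), 3))

def chi(lam):
    """Tr(A log N(Y)), the function from the question."""
    return np.trace(A @ sym_logm(N(Y_of(lam))))

def finite_diff(f, lam, k, h=1e-6):
    """Central finite difference of f with respect to lambda_k."""
    e = np.zeros_like(lam)
    e[k] = h
    return (f(lam + e) - f(lam - e)) / (2 * h)

Y = Y_of(lam)
G = dlog_adjoint(N(Y), A)

phi_exact = np.array([np.trace(3 * Y @ Y @ X[k]) for k in range(m)])
chi_exact = np.array([np.trace(G @ N(X[k])) for k in range(m)])
phi_fd = np.array([finite_diff(phi, lam, k) for k in range(m)])
chi_fd = np.array([finite_diff(chi, lam, k) for k in range(m)])

print("Tr(Y^3):         max abs difference", np.max(np.abs(phi_exact - phi_fd)))
print("Tr(A log N(Y)):  max abs difference", np.max(np.abs(chi_exact - chi_fd)))
```

Both differences should be small compared with the derivatives themselves. The symmetric setting is chosen only so that `numpy.linalg.eigh` can be used for the logarithm; the formulas themselves do not require it.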