Derivative of a function with respect to a lower triangular matrix

I have a scalar-valued function $f$ defined on the space of $n \times n$ matrices. I have an analytic expression for the gradient, $\nabla f$.

Now, suppose I define a function, $g$, on the space of lower triangular matrices as $g(L) := f(LL^T)$. I'd like to derive an analytic expression for the gradient of $g$. I assume some sort of chain rule should go through so that I can get a rather simple expression using $\nabla f$, but I'm not sure how to proceed.

Thanks for your help.

There is 1 best solution below

For ease of typing, denote the gradients of $f(X)$ and $g(L)$ as
$$\eqalign{ F &= \frac{\partial f}{\partial X} = \nabla f \\ G &= \frac{\partial g}{\partial L} }$$
First, calculate the differential of $X$ in terms of $L$.
$$\eqalign{ X &= LL^T \quad\implies\quad dX = dL\,L^T + L\,dL^T }$$
Then write the differential of the function and perform a change of variables from $X\to L$.
$$\eqalign{ dg &= df \\ &= F:dX \\ &= F:(dL\,L^T+L\,dL^T) \\ &= F:(dL\,L^T) + F^T:(dL\,L^T) \\ &= (F+F^T):(dL\,L^T) \\ &= (F+F^T)L:dL }$$
If all $n^2$ entries of $L$ were free, this would give $\frac{\partial g}{\partial L} = (F+F^T)L$. Because $L$ is constrained to be lower triangular, $dL$ has no entries above the diagonal, so only the lower-triangular part of this matrix contributes to $dg$. The gradient with respect to the free entries of $L$ is therefore
$$\eqalign{ G &= {\rm tril}\big((F+F^T)L\big) }$$
where ${\rm tril}$ keeps the entries on and below the diagonal and sets the rest to zero.

In the above, a colon denotes the trace (Frobenius) inner product, i.e.
$$\eqalign{ A:B = {\rm Tr}(A^TB) = {\rm Tr}(B^TA) = B:A }$$
The terms in such a product can be rearranged in a number of equivalent ways, e.g.
$$\eqalign{ A:B &= A^T:B^T \\ A:BC &= B^TA:C = AC^T:B }$$
due to the properties of the trace function.
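A quick numerical sanity check of this result can be sketched in NumPy. The test function $f(X) = {\rm Tr}(A^TX) + \tfrac12\|X\|_F^2$, the matrix `A`, and the helper names below are all assumptions chosen for illustration; they do not come from the question. Since only the entries of $L$ on and below the diagonal are free variables, the sketch keeps the lower triangle of $(F+F^T)L$ and compares it with central finite differences over those same entries.

```python
import numpy as np

def f(X, A):
    # Hypothetical test function: f(X) = Tr(A^T X) + 0.5 * ||X||_F^2
    return np.sum(A * X) + 0.5 * np.sum(X * X)

def grad_f(X, A):
    # Analytic gradient of the test function: F = df/dX = A + X
    return A + X

def grad_g(L, A):
    # Chain-rule result: (F + F^T) L with F evaluated at X = L L^T,
    # restricted to the free (lower-triangular) entries of L
    F = grad_f(L @ L.T, A)
    return np.tril((F + F.T) @ L)

def numerical_grad_g(L, A, eps=1e-6):
    # Central differences, perturbing only entries on or below the diagonal
    n = L.shape[0]
    G = np.zeros_like(L)
    for i in range(n):
        for j in range(i + 1):
            E = np.zeros_like(L)
            E[i, j] = eps
            f_plus = f((L + E) @ (L + E).T, A)
            f_minus = f((L - E) @ (L - E).T, A)
            G[i, j] = (f_plus - f_minus) / (2 * eps)
    return G

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))           # deliberately non-symmetric
L = np.tril(rng.standard_normal((n, n)))  # random lower-triangular point

G_analytic = grad_g(L, A)
G_numeric = numerical_grad_g(L, A)
max_abs_error = np.max(np.abs(G_analytic - G_numeric))
print("max abs difference:", max_abs_error)
```

Using a non-symmetric `A` makes $F \ne F^T$, so the check exercises the $F + F^T$ term rather than just $2F$. The two gradients should agree up to finite-difference error.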