(Matrix Calculus) Chain Rule


Let $A \in \mathbb{R}^{n \times n}$ be an invertible matrix, $v \in \mathbb{R}^{n}$ and $\kappa: \mathbb{R}^{n} \rightarrow \mathbb{R} $ . What is $\frac{\partial\ \kappa(A^{-1}v)}{\partial\ A}$?

I've been trying all sorts of equations from the Matrix Cookbook, but none of them leads to success.


There are 2 best solutions below

6
On BEST ANSWER

For convenience, let's define two new vector variables

$$\eqalign{ x &= A^{-1}v \cr g &= \frac{\partial\kappa}{\partial x} \cr }$$

Also, let's use a colon to denote the trace/Frobenius product, i.e.

$$A:BC = {\rm tr}(A^TBC)$$

The properties of the trace give rise to lots of rules for rearranging the terms in a Frobenius product, e.g.

$$\eqalign{ A:BC &= BC:A \cr &= AC^T:B \cr &= B^TA:C \cr }$$

Write the differential and gradient of the function in terms of these new variables

$$\eqalign{ d\kappa &= g:dx \cr &= g:(dA^{-1})\,v \cr &= -gv^T:A^{-1}\,dA\,A^{-1} \cr &= -A^{-T}gv^TA^{-T}:dA \cr \frac{\partial\kappa}{\partial A} &= -A^{-T}gv^TA^{-T} \cr }$$

From your other comments, we have an expression for $g$, namely $g = -\kappa x$, which we can substitute

$$\eqalign{ \frac{\partial\kappa}{\partial A} &= -A^{-T}(-\kappa x)v^TA^{-T} \cr &= \kappa A^{-T}A^{-1}vv^TA^{-T} \cr }$$
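A quick numerical sanity check of the gradient formula can be sketched with NumPy and central finite differences. The choice $\kappa(x) = \exp(-\tfrac12 x^Tx)$ is an assumption made only for illustration (it is one function satisfying $g = -\kappa x$); the helper names `grad_closed_form` and `grad_finite_diff` are likewise hypothetical.

```python
import numpy as np

def kappa(x):
    # Assumed example function: its gradient is -kappa(x) * x
    return np.exp(-0.5 * x @ x)

def grad_closed_form(A, v):
    # Closed form from the derivation: -A^{-T} g v^T A^{-T}, with g = -kappa(x) x
    A_inv = np.linalg.inv(A)
    x = A_inv @ v
    g = -kappa(x) * x
    return -A_inv.T @ np.outer(g, v) @ A_inv.T

def grad_finite_diff(A, v, eps=1e-6):
    # Central differences, perturbing one entry of A at a time
    G = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            E = np.zeros_like(A)
            E[i, j] = eps
            f_plus = kappa(np.linalg.solve(A + E, v))
            f_minus = kappa(np.linalg.solve(A - E, v))
            G[i, j] = (f_plus - f_minus) / (2 * eps)
    return G

rng = np.random.default_rng(0)
n = 4
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # small perturbation of I, so A is invertible
v = rng.standard_normal(n)

max_err = np.max(np.abs(grad_closed_form(A, v) - grad_finite_diff(A, v)))
print("max abs difference:", max_err)
```

If the formula is right, the printed difference should be at the level of finite-difference noise, many orders of magnitude below the size of the gradient entries.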

1
On

You can decompose your function as

$$ f = \kappa \circ h \circ g $$

where

$$ g(X) =X^{-1} \quad ; \quad h(X) = Xv . $$

In differential form

$$ d \kappa = \langle \nabla \kappa(x), dx \rangle \quad ; \quad d h = (dX) v \quad ; \quad d g = - X^{-1} (dX) X^{-1} . $$

Then, by applying the chain rule, we get the differential of $f$

$$ d f = -\Big\langle \nabla \kappa(A^{-1}v), A^{-1} (dA) A^{-1} v \Big\rangle. $$
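The differential above can be checked numerically along a single direction $dA$. The sketch below assumes NumPy and, purely for illustration, takes $\kappa(x) = \exp(-\tfrac12 x^Tx)$; the names `grad_kappa` and `df_formula` are hypothetical helpers, not part of the question.

```python
import numpy as np

def kappa(x):
    # Assumed example function for illustration
    return np.exp(-0.5 * x @ x)

def grad_kappa(x):
    # Gradient of the assumed kappa
    return -kappa(x) * x

def f(A, v):
    # f(A) = kappa(A^{-1} v), computed with a linear solve
    return kappa(np.linalg.solve(A, v))

def df_formula(A, v, dA):
    # df = -<grad kappa(A^{-1} v), A^{-1} dA A^{-1} v>
    A_inv = np.linalg.inv(A)
    x = A_inv @ v
    return -grad_kappa(x) @ (A_inv @ dA @ x)

rng = np.random.default_rng(1)
n = 3
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # invertible by construction
v = rng.standard_normal(n)
dA = rng.standard_normal((n, n))                   # arbitrary direction

t = 1e-6
df_numeric = (f(A + t * dA, v) - f(A - t * dA, v)) / (2 * t)
df_exact = df_formula(A, v, dA)
print(df_numeric, df_exact)
```

The two printed numbers should agree up to finite-difference error, which is what the chain rule predicts for any direction $dA$.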

You can then write the derivative as a matrix

$$ \frac{\partial f}{\partial A} (A) = (x_{i,j} )^n_{i,j = 1}, $$

where each entry is given by

$$ x_{i,j} = -\Big\langle \nabla \kappa(A^{-1}v), A^{-1} X_{i,j} A^{-1} v \Big \rangle $$

with $X_{i,j}$ being a matrix with $1$ at position $i,j$ and $0$ everywhere else.
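As a consistency check, the entrywise expression can be collapsed into a single matrix. Writing $X_{i,j} = e_i e_j^T$, where $e_i$ denotes the $i$-th standard basis vector (notation introduced here, not in the original answer), one gets

$$\eqalign{ x_{i,j} &= -\nabla\kappa(A^{-1}v)^T A^{-1} e_i\, e_j^T A^{-1} v \cr &= -\big(A^{-T}\nabla\kappa(A^{-1}v)\big)_i\,\big(A^{-1}v\big)_j \cr \frac{\partial f}{\partial A} &= -A^{-T}\,\nabla\kappa(A^{-1}v)\,(A^{-1}v)^T \cr &= -A^{-T}\,\nabla\kappa(A^{-1}v)\,v^TA^{-T} \cr }$$

which agrees with the closed form $-A^{-T}gv^TA^{-T}$ from the other answer, with $g = \nabla\kappa(A^{-1}v)$.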