What is the derivative of a matrix-valued composite function?


Preliminaries: Jordan algebra

In the setting of Jordan algebras, the Jordan product $\circ$ (defined below) is often expressed through the following arrow matrix. For a vector $a\in\Bbb R^m$,

$$ \mathrm{Arw}(a):=\left[\begin{array}{cc} a_1 & a_{-1}^\top \\ a_{-1} & a_1 I \end{array}\right], $$

where $a_{-1}:=(a_2,\dots,a_m)^\top\in\Bbb R^{m-1}$ and $I$ is the identity matrix of order $m-1$.

For vectors $a,b\in\Bbb R^m$, the Jordan product $a\circ b$ of $a$ and $b$ is defined as

$$ a\circ b:=\left[\begin{array}{c} a^\top b \\ a_1b_{-1}+b_1a_{-1} \end{array}\right]\in\Bbb R^m. $$

The Jordan product can equivalently be written using arrow matrices:

$$ a\circ b=\mathrm{Arw}(a)b=\mathrm{Arw}(b)a. $$
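As a quick numerical sanity check of the identity above, here is a minimal NumPy sketch. The helper names `arw` and `jordan` are my own (hypothetical, not from any library); they simply transcribe the two definitions given in the preliminaries.

```python
import numpy as np

def arw(a):
    """Arrow matrix Arw(a) for a vector a in R^m (helper name is an assumption)."""
    a = np.asarray(a, dtype=float)
    m = a.shape[0]
    A = a[0] * np.eye(m)   # a_1 on the whole diagonal
    A[0, 1:] = a[1:]       # first row:    a_{-1}^T
    A[1:, 0] = a[1:]       # first column: a_{-1}
    return A

def jordan(a, b):
    """Jordan product a∘b, written directly from the componentwise definition."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.concatenate(([a @ b], a[0] * b[1:] + b[0] * a[1:]))

rng = np.random.default_rng(0)
a, b = rng.standard_normal(5), rng.standard_normal(5)

# Both comparisons should hold up to floating-point rounding.
print(np.allclose(jordan(a, b), arw(a) @ b))
print(np.allclose(jordan(a, b), arw(b) @ a))
```

Since $\mathrm{Arw}(a)$ depends linearly on $a$, the product $a\circ b$ is bilinear, which is the property a product rule for $f(x)\circ g(x)$ relies on.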

Question

I am struggling with the derivative of the vector-valued function with respect to $x\in\Bbb R^n$:

$$ f(x)\circ g(x), $$

where $f,g:\Bbb R^n\to \Bbb R^m$.

As I explained above, the question can be interpreted as the derivative of

$$ \mathrm{Arw}(f(x))g(x), $$

or

$$ \mathrm{Arw}(g(x))f(x). $$

However, I don't know how to differentiate a composite matrix-valued function. I therefore cannot apply the chain rule here, because of my lack of knowledge of matrix analysis, especially regarding the derivative of $\mathrm{Arw}(f(x))$.

More generally, the question is how to compute the derivative of

$$ X(f(x)), $$

where $X:\Bbb R^m\to\Bbb S^m$ is a matrix-valued function from $\Bbb R^m$ to the space $\Bbb S^m$ of symmetric matrices.

I would be very grateful for any help.


There are 2 best solutions below

0
On

I will denote the Jordan product by $\star$, since I'm too used to $\circ$ for composition. Notice that your function $\def\Arr{\mathrm{Arw}} \def\R{\mathbb{R}} \Arr:\R^m \to M_m(\R)$ is linear, so $$\Arr(a+h)=\Arr(a)+\Arr(h).$$ Hence, if $F(x)=f(x)\star g(x)=\Arr(f(x))g(x)$, you have, ignoring all terms where $h$ appears with order greater than $1$: $$\begin{align} F(x+h) &= \Arr(f(x+h))g(x+h) \\ &= \Arr(f(x)+f'(x)(h))[g(x)+g'(x)(h)] \\ &= [\Arr(f(x))+\Arr(f'(x)(h))][g(x)+g'(x)(h)] \\ &= \Arr(f(x))g(x)+\Arr(f(x))g'(x)(h)+\Arr(f'(x)(h))g(x) \\ &= F(x)+f(x)\star g'(x)(h)+f'(x)(h) \star g(x). \end{align}$$

Hence (again disregarding terms of higher order): $$\begin{align} F'(x)(h) &=F(x+h)-F(x) \\ &=f(x)\star g'(x)(h)+f'(x)(h) \star g(x). \end{align}$$

You might know $f'(x)$ as the Jacobian matrix $Jf(x)$. With this notation, you have $$\begin{align} F'(x)(h) &=f(x)\star (Jg(x)h) + (Jf(x)h) \star g(x). \end{align}$$
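In matrix form the formula above reads $JF(x)=\mathrm{Arw}(f(x))\,Jg(x)+\mathrm{Arw}(g(x))\,Jf(x)$, using commutativity for the second term. Here is a small finite-difference check of that, as a sketch only: the maps `f` and `g` are arbitrary test functions I made up ($\Bbb R^2\to\Bbb R^3$), and `arw` and `fd_jacobian` are hypothetical helper names.

```python
import numpy as np

def arw(a):
    """Arrow matrix Arw(a); helper name is an assumption."""
    m = a.shape[0]
    A = a[0] * np.eye(m)
    A[0, 1:] = a[1:]
    A[1:, 0] = a[1:]
    return A

# Made-up smooth test maps f, g: R^2 -> R^3 with hand-written Jacobians.
def f(x):
    return np.array([x[0] * x[1], np.sin(x[0]), x[1] ** 2])

def Jf(x):
    return np.array([[x[1], x[0]], [np.cos(x[0]), 0.0], [0.0, 2 * x[1]]])

def g(x):
    return np.array([np.exp(x[0]), x[0] + x[1], np.cos(x[1])])

def Jg(x):
    return np.array([[np.exp(x[0]), 0.0], [1.0, 1.0], [0.0, -np.sin(x[1])]])

def F(x):
    # F(x) = f(x) ⋆ g(x) = Arw(f(x)) g(x)
    return arw(f(x)) @ g(x)

def JF(x):
    # Product rule: JF = Arw(f) Jg + Arw(g) Jf
    return arw(f(x)) @ Jg(x) + arw(g(x)) @ Jf(x)

def fd_jacobian(func, x, eps=1e-6):
    """Central finite-difference Jacobian, one column per input coordinate."""
    cols = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        cols.append((func(x + step) - func(x - step)) / (2 * eps))
    return np.stack(cols, axis=1)

x0 = np.array([0.3, -0.7])
err = np.max(np.abs(JF(x0) - fd_jacobian(F, x0)))
print(err)  # should be tiny (finite-difference error only)
```

The analytic and numerical Jacobians should agree up to the finite-difference error, which is consistent with the first-order expansion above.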


Another, more direct way to obtain the above result is to note that $\star$ is bilinear, i.e. linear in each argument, so $$\begin{align} (a+c)\star b &= a\star b + c\star b \\ a\star(b+c) &= a\star b + a\star c. \end{align}$$ Hence, again dropping terms of order greater than $1$ in $h$: $$\begin{align} F(x+h) &= f(x+h)\star g(x+h) \\ &= [f(x)+f'(x)(h)]\star[g(x)+g'(x)(h)] \\ &= F(x)+f(x)\star g'(x)(h)+f'(x)(h) \star g(x). \end{align}$$

0
On

The key ideas: $\;1)$ the product rule for differentials, $\;2)$ the Jordan product is commutative

$$\eqalign{ \def\LR#1{\left(#1\right)} \def\gx#1{\frac{\partial #1}{\partial x}} h &= g\circ f \\ dh &= g\circ df &+\; dg\circ f \qquad &\{1\} \\ &= g\circ df &+\; f\circ dg \qquad &\{2\} \\ \gx{h} &= g\circ \gx f &+\; f\circ \gx g \\\\ }$$


Interestingly, the arrow function can be expressed using standard matrix notation, leading to a purely matrix result $$\eqalign{ F &= {\rm Arw}(f) \;\doteq\; fe^T + ef^T + \LR{e^Tf}\LR{I-2\,ee^T} \\ G &= {\rm Arw}(g) \,\;\doteq\; ge^T + eg^T + \LR{e^Tg}\LR{I-2\,ee^T} \\ \gx{h} &= G\LR{\gx f} + F\LR{\gx g} \\ }$$ where $e$ is the first standard basis vector of $\Bbb R^m$ and $I$ is the $m\times m$ identity matrix.
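The closed-form expression for the arrow matrix can be compared numerically with the block definition from the question. This is only a sketch; `arw` and `arw_closed_form` are hypothetical helper names of my own.

```python
import numpy as np

def arw(a):
    """Arrow matrix built from the block definition in the question."""
    m = a.shape[0]
    A = a[0] * np.eye(m)
    A[0, 1:] = a[1:]
    A[1:, 0] = a[1:]
    return A

def arw_closed_form(f):
    """Arw(f) = f e^T + e f^T + (e^T f)(I - 2 e e^T), e = first standard basis vector."""
    m = f.shape[0]
    e = np.zeros(m)
    e[0] = 1.0
    return (np.outer(f, e) + np.outer(e, f)
            + (e @ f) * (np.eye(m) - 2.0 * np.outer(e, e)))

rng = np.random.default_rng(1)
v = rng.standard_normal(4)

# The two constructions should agree entry by entry.
print(np.allclose(arw(v), arw_closed_form(v)))
```

The $(1,1)$ entry explains the factor $2$: the two outer products contribute $2f_1$ there, and the last term subtracts $f_1$, leaving $f_1$ as required.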