Jacobian of the product of a matrix function and a vector function


Let's say I have an $m\times m$ matrix function $A=(a_{ij})$, where each $a_{ij}:\mathbb R^n\to\mathbb R$ is a scalar function. Let's say I also have a vector-valued function $f:\mathbb R^n\to\mathbb R^m$. Then we can define another vector-valued function $g:\mathbb R^n\to\mathbb R^m$ such that $g=Af$, where, for each $x\in\mathbb R^n$, $(Af)(x)$ is the product of the matrix $A(x)$ with the vector $f(x)$.

Is there any relation between the Jacobian of $g$, $J(g)$, and $A$ and $f$?

I ask about the general case, but the question arose while working with the Jacobian of $f=(f_1,\ldots,f_n)$ itself (so $m=n$), with $J(f)$ playing the role of the matrix $A$ in this scenario. The notes I was reading said that if $g=J(f)f$ then we would have

$$J(g) = J(f)J(f)^{\text{T}}+\sum_{i=1}^mH(f_i)f_i$$

where $H(f_i)$ is the Hessian of $f_i:\mathbb R^n\to\mathbb R$. I've been trying to derive this myself, and I think the transpose written there is wrong; perhaps it should be applied to the Hessians instead (maybe?).

Any thoughts on this?
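For what it's worth, one can probe such a formula with a finite-difference sanity check. Here is a sketch (numpy; the example $f$, the step sizes, and the row-wise Hessian correction term $[H(f_i)f(x)]_i$ are my own guesses, not taken from the notes):

```python
import numpy as np

# Central-difference Jacobian of a vector-valued F at x (step h is a guess).
def jac(F, x, h=1e-5):
    x = np.asarray(x, dtype=float)
    cols = [(F(x + h*e) - F(x - h*e)) / (2*h)
            for e in np.eye(len(x))]
    return np.stack(cols, axis=-1)          # shape (m, n)

# Arbitrary example with m = n = 2 (my own choice).
def f(x):
    return np.array([x[0]**2 * x[1], np.sin(x[0]) + x[1]**3])

def g(x):                                    # g = J(f) f
    return jac(f, x) @ f(x)

x  = np.array([0.5, 0.7])
Jf = jac(f, x)
Jg = jac(g, x)

# Candidate correction term: row i is H(f_i) @ f(x),
# where H(f_i) is the (numerical) Hessian of the component f_i.
H    = [jac(lambda y, i=i: jac(f, y)[i], x) for i in range(2)]
corr = np.array([H[i] @ f(x) for i in range(2)])

print(np.allclose(Jg, Jf @ Jf + corr, atol=1e-4))    # without the transpose
print(np.allclose(Jg, Jf @ Jf.T + corr, atol=1e-4))  # with the transpose
```

For this particular $f$ the version without the transpose matches the numerical Jacobian of $g$ and the transposed version does not, which is at least consistent with my suspicion.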

Best answer:

I would suggest you not think only in terms of matrices because this very quickly gets unwieldy. We still have a product rule in this case, namely for any $x,\xi\in\Bbb{R}^n$, \begin{align} Dg_x[\xi]&= (DA_x[\xi])\cdot f(x) + A(x)\cdot (Df_x[\xi]) \end{align} The meaning of this is that for each point $x$ in the domain of the functions,

  • $Dg_x\in L(\Bbb{R}^n,\Bbb{R}^m)$ is a linear transformation (hence by a choice of basis can be represented as an $m\times n$ matrix called the Jacobian matrix $Jg_x$; but I would highly suggest you avoid matrices whenever possible). So, it eats a vector $\xi\in\Bbb{R}^n$ and spits out a vector $Dg_x[\xi]\in\Bbb{R}^m$.
  • $DA_x\in L(\Bbb{R}^n, M_{m\times m}(\Bbb{R}))$ is a linear transformation. This is a linear transformation which eats a vector $\xi\in\Bbb{R}^n$ and spits out a matrix $DA_x[\xi]\in M_{m\times m}(\Bbb{R})$. Hence, in the first term of my equation above I was able to multiply this matrix by the vector $f(x)\in\Bbb{R}^m$. The fact that $DA_x$ is a linear transformation between $\Bbb{R}^n$ and $M_{m\times m}(\Bbb{R})$ means that it is rather unwieldy to assign a matrix representation to it; in particular, if you want to "vectorize" $M_{m\times m}(\Bbb{R})$, you have to choose an ordering of the entries when you identify it with $\Bbb{R}^{m^2}$, and there are very many possible conventions here. This is why one encounters so many formulae when dealing with derivatives of matrices: it all stems from the desire to express as a matrix something which shouldn't be expressed as one.
  • $Df_x\in L(\Bbb{R}^n,\Bbb{R}^m)$ is a linear transformation which eats a vector $\xi\in\Bbb{R}^n$ and spits out a vector $Df_x[\xi]\in\Bbb{R}^m$.
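To see the coordinate-free product rule in action, one can compare directional derivatives numerically. A small sketch (numpy; the particular $A$, $f$, point $x$, and direction $\xi$ are arbitrary choices of mine):

```python
import numpy as np

# Directional derivative D F_x[xi] by central differences; works for
# vector- or matrix-valued F, since numpy arrays subtract elementwise.
def dderiv(F, x, xi, h=1e-5):
    return (F(x + h*xi) - F(x - h*xi)) / (2*h)

# Arbitrary example with n = m = 2 (my own choice).
def A(x):
    return np.array([[x[0]*x[1],    x[1]**2],
                     [np.cos(x[0]), x[0] + x[1]]])

def f(x):
    return np.array([np.exp(x[0]), x[0]*x[1]**2])

def g(x):
    return A(x) @ f(x)

x  = np.array([0.3, -0.4])
xi = np.array([1.0, 2.0])

lhs = dderiv(g, x, xi)                                   # Dg_x[xi]
rhs = dderiv(A, x, xi) @ f(x) + A(x) @ dderiv(f, x, xi)  # product rule
print(np.allclose(lhs, rhs, atol=1e-7))
```

Note that `dderiv(A, x, xi)` really is a matrix, exactly as described in the second bullet, and it gets multiplied by the vector $f(x)$.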

I would suggest you take a look at this answer of mine for a general product rule.


Anyway, if you wish to be an odd-ball and express the first equation with a bunch of indices, then we have for all $i\in\{1,\dots, m\},j\in\{1,\dots, n\}$, \begin{align} \frac{\partial g_i}{\partial x^j}(x)&=\sum_{k=1}^m\frac{\partial A_{ik}}{\partial x^j}(x)\cdot f_k(x) + \sum_{k=1}^mA_{ik}(x)\cdot \frac{\partial f_k}{\partial x^j}(x). \end{align} So, the fact that $\frac{\partial A_{ik}}{\partial x^j}$ has three indices is already an indication that matrix notation is not suitable for the task at hand.
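The index formula is also easy to verify numerically, looping over $i$, $j$, $k$ exactly as written. A rough sketch (numpy; the particular $A$, $f$, and sizes $m=2$, $n=3$ are arbitrary choices of mine):

```python
import numpy as np

# Partial derivative of a scalar-valued F with respect to x_j,
# by central differences (step h is a guess).
def pd(F, x, j, h=1e-5):
    e = np.zeros_like(x); e[j] = h
    return (F(x + e) - F(x - e)) / (2*h)

m, n = 2, 3   # small example sizes (my own choice)

def A(x):     # entries A_ik : R^3 -> R, arbitrary choices
    return np.array([[x[0]*x[1], np.sin(x[2])],
                     [x[0]**2,   x[1] + x[2]]])

def f(x):     # f : R^3 -> R^2
    return np.array([x[0] + x[2]**2, x[0]*x[1]])

def g(x):
    return A(x) @ f(x)

x = np.array([0.2, 0.5, -0.3])

# Check dg_i/dx_j = sum_k (dA_ik/dx_j) f_k + sum_k A_ik (df_k/dx_j).
errs = []
for i in range(m):
    for j in range(n):
        lhs = pd(lambda y: g(y)[i], x, j)
        rhs = sum(pd(lambda y: A(y)[i, k], x, j) * f(x)[k]
                  + A(x)[i, k] * pd(lambda y: f(y)[k], x, j)
                  for k in range(m))
        errs.append(abs(lhs - rhs))

print(max(errs) < 1e-8)
```

The triple loop over $i$, $j$, $k$ makes the point concrete: the array of partials $\frac{\partial A_{ik}}{\partial x^j}$ genuinely needs three indices, so no single matrix holds it.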