Derivative of a function w.r.t. a vector where the matrix elements are themselves functions


Consider the function $f: \mathbb{R}^N \to \mathbb{R}$ with

$$ f(r) = ||Ap - b||_2^{2} = p^{\top}A^{\top}Ap - 2p^{\top}A^{\top}b + b^{\top}b $$

where

  • $A \in \mathbb{R}^{N \times N}$ and $p,b \in \mathbb{R}^N$

  • The elements of the matrix $A$ are linear functions of the variables $r = (r_1, \ldots, r_N)$, i.e. $a_{i,j} : \mathbb{R}^{N} \to \mathbb{R}$ and $A = (a_{i,j}(r_1, \ldots, r_N))_{i = 1, \ldots, N, j = 1, \ldots, N}$.

  • I know the partial derivatives of each matrix element, i.e. $ \frac{\partial a_{i,j}}{\partial r_k}$ is known.

Is there any chance for a closed form of $\nabla f$? Any hints or help is really appreciated!

PS: This question is similar to this one. But there the matrix depends only on one variable.


Best answer

Let's use a colon to denote the trace/Frobenius product, i.e. $$\eqalign{ X,Y &\in {\mathbb R}^{m\times n} \\ X:Y &= {\rm Tr}(X^TY) = {\rm Tr}(Y^TX) = Y:X \\ }$$ or in terms of components $$\eqalign{ X:Y &= \sum_{i=1}^m\sum_{j=1}^n X_{ij}Y_{ij} \qquad&\big({\rm explicit\,summation}\big) \\ &= X_{ij}Y_{ij} \qquad&\big({\rm Einstein\,convention}\big) \\ }$$ The colon product can also be applied to vectors by setting $n=1$ and treating them as $m\times 1$ matrices.
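As a quick numerical sanity check of these identities, here is a minimal NumPy sketch. The helper name `colon`, the matrix sizes, and the random test data are illustrative assumptions and not part of the question.

```python
import numpy as np

def colon(X, Y):
    """Trace/Frobenius product X:Y = Tr(X^T Y) (hypothetical helper name)."""
    return np.trace(X.T @ Y)

rng = np.random.default_rng(0)           # arbitrary seed, illustration only
X = rng.standard_normal((3, 4))          # m = 3, n = 4
Y = rng.standard_normal((3, 4))

trace_form = colon(X, Y)                     # Tr(X^T Y)
sum_form = np.sum(X * Y)                     # explicit double summation
einsum_form = np.einsum("ij,ij->", X, Y)     # Einstein convention X_ij Y_ij

# Vectors can be treated as single-column matrices;
# the colon product then reduces to the ordinary dot product.
u = rng.standard_normal((3, 1))
v = rng.standard_normal((3, 1))
vector_form = colon(u, v)
```

All three forms should agree up to floating-point roundoff, and `colon(X, Y)` should equal `colon(Y, X)`.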

The following matrix will turn out to be useful: $$G = 2\left(Ap-b\right)p^T$$

Use this product to write the function and calculate its differential, assuming that $p$ and $b$ do not depend on $r$ $$\eqalign{ f &= (Ap-b):(Ap-b) \\ df &= 2(Ap-b):(dA\;p) \\ &= 2\left(Ap-b\right)p^T:dA \\ &= G:dA \\ }$$ The third line follows from the cyclic property of the trace: with $x = Ap-b$, $$x:(dA\;p) = {\rm Tr}(x^T\,dA\;p) = {\rm Tr}(p\,x^T\,dA) = (x\,p^T):dA$$

Switch to index notation and substitute the known derivative $\;H_{ijk} = \left(\frac{\partial A_{ij}}{\partial r_k}\right)$, which is constant because each $a_{i,j}$ is linear in $r$ $$\eqalign{ dA_{ij} &= H_{ijk}\,dr_k \\ df &= G_{ij}\,H_{ijk}\,dr_k \\ \frac{\partial f}{\partial r_k} &= G_{ij}\,H_{ijk} \qquad&\big({\rm gradient\,wrt\,}r\big) \\ }$$

or, using explicit summations, the components of the gradient are $$\eqalign{ \frac{\partial f}{\partial r_k} &= \sum_{i=1}^N\sum_{j=1}^N G_{ij} \left(\frac{\partial A_{ij}}{\partial r_k}\right) \\ }$$
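The gradient formula above can be checked numerically. The following is a minimal NumPy sketch, not a definitive implementation. It assumes an affine parametrization $A(r)_{ij} = (A_0)_{ij} + \sum_k H_{ijk}\,r_k$ with a constant offset `A0`, and it holds `p` and `b` fixed. The names `build_A`, `f`, `grad_f`, `A0` and the random test data are hypothetical and chosen only for illustration.

```python
import numpy as np

def build_A(r, A0, H):
    """A(r)_ij = A0_ij + sum_k H_ijk r_k, so dA_ij/dr_k = H_ijk (assumed affine form)."""
    return A0 + np.einsum("ijk,k->ij", H, r)

def f(r, A0, H, p, b):
    """f(r) = ||A(r) p - b||_2^2."""
    res = build_A(r, A0, H) @ p - b
    return res @ res

def grad_f(r, A0, H, p, b):
    """Gradient via df/dr_k = G_ij H_ijk with G = 2 (A p - b) p^T."""
    A = build_A(r, A0, H)
    G = 2.0 * np.outer(A @ p - b, p)       # N x N matrix G
    return np.einsum("ij,ijk->k", G, H)    # contract over i and j

# Random test data (illustration only; p and b are held constant).
rng = np.random.default_rng(0)
N = 4
A0 = rng.standard_normal((N, N))
H = rng.standard_normal((N, N, N))         # H[i, j, k] = dA_ij / dr_k
p = rng.standard_normal(N)
b = rng.standard_normal(N)
r = rng.standard_normal(N)

analytic = grad_f(r, A0, H, p, b)

# Central finite differences along each coordinate direction as an independent check.
h = 1e-6
numeric = np.array([
    (f(r + h * e, A0, H, p, b) - f(r - h * e, A0, H, p, b)) / (2 * h)
    for e in np.eye(N)
])

# Largest absolute difference between the two gradients.
print(np.max(np.abs(analytic - numeric)))
```

Storing $H$ as a dense `(N, N, N)` array costs $O(N^3)$ memory. If each $a_{i,j}$ depends on only a few of the $r_k$, summing over the nonzero entries of $H$ directly would likely be cheaper.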