Derivative of a composite function $\mathbb{R} \to \mathbb{R}^{n\times n} \to \mathbb{R} $


I have a rotation matrix $\boldsymbol R \in \mathbb{R}^{3\times 3} $ which is a function of a certain angle parameter $\alpha$.

The roll-pitch-yaw angles can be extracted from the matrix, through some trigonometric functions of its entries.

For instance, one of such angles $\phi$ is then a function of the matrix $\boldsymbol R$, i.e. $\phi = f\left( \boldsymbol R\right) $.

I have to compute the derivative of $\phi$ with respect to $\alpha$:

$$\frac{\mathrm d \phi \left( \alpha \right )}{\mathrm d \alpha} = \frac{\mathrm d f\left( \boldsymbol R \left( \alpha \right )\right) }{\mathrm d \alpha} $$

How can I compute this derivative?

The following chain rule $$\frac{\mathrm d \phi \left( \alpha \right )}{\mathrm d \alpha} = \frac{\mathrm d f\left( \boldsymbol R \left( \alpha \right )\right) }{\mathrm d \boldsymbol R \left( \alpha \right )} \frac{ \mathrm d \boldsymbol R \left( \alpha \right )}{\mathrm d \alpha}$$ cannot be applied directly, because both of these derivatives are $3\times 3$ matrices, while the result must of course be a scalar.

All I have is the expression of the function $f$ and the derivative of the matrix with respect to the angle $$\frac{ \mathrm d \boldsymbol R \left( \alpha \right )}{\mathrm d \alpha} = \boldsymbol S\left(\boldsymbol \delta\right) \boldsymbol R \left( \alpha \right )$$ where $\boldsymbol S\left( \boldsymbol \delta \right) $ is the skew-symmetric matrix associated with the cross product: $$ \boldsymbol S\left(\boldsymbol \delta\right) \boldsymbol v = \boldsymbol \delta \times \boldsymbol v$$ and, of course, some structural information, such as $ \boldsymbol R^{-1} = \boldsymbol R^{T}$ and $\det (\boldsymbol R) = 1 $.
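As a sanity check of this derivative formula, here is a short numerical sketch (assuming NumPy, with a rotation about a fixed unit axis built via Rodrigues' formula, for which $\boldsymbol\delta$ is the axis itself) comparing $\boldsymbol S(\boldsymbol\delta)\boldsymbol R(\alpha)$ against a central finite difference of $\boldsymbol R$:

```python
import numpy as np

def skew(v):
    # S(v) such that skew(v) @ u == np.cross(v, u)
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def rot(axis, alpha):
    # Rodrigues' formula for a rotation by alpha about a unit axis
    K = skew(axis)
    return np.eye(3) + np.sin(alpha) * K + (1.0 - np.cos(alpha)) * (K @ K)

axis = np.array([0.0, 0.0, 1.0])  # arbitrary choice: rotation about z
alpha, h = 0.7, 1e-6

analytic = skew(axis) @ rot(axis, alpha)                           # S(delta) R(alpha)
numeric = (rot(axis, alpha + h) - rot(axis, alpha - h)) / (2 * h)  # central difference
print(np.max(np.abs(analytic - numeric)))  # very small
```

The two matrices agree to finite-difference accuracy, confirming $\mathrm d\boldsymbol R/\mathrm d\alpha = \boldsymbol S \boldsymbol R$ for this parameterization.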

EDIT:

As I feared, and as Bill Wallis's comment confirmed, the derivative can be computed via the vectorization $$\boldsymbol r := \begin{pmatrix} \boldsymbol c_{1} \\ \boldsymbol c_{2} \\ \boldsymbol c_{3} \end{pmatrix} $$ where $$\boldsymbol R = \begin{pmatrix} \boldsymbol c_{1} &\boldsymbol c_{2} & \boldsymbol c_{3} \end{pmatrix} $$ so that the derivative is simply the dot product of two vectors, which is equivalent to treating all the entries as separate functions:

$$\frac{\mathrm d \phi \left( \alpha \right )}{\mathrm d \alpha} = \sum_{i=1}^{3}\sum_{j=1}^{3}\frac{\mathrm d \phi }{\mathrm d r_{ij} } \frac{ \mathrm d r_{ij}}{\mathrm d \alpha}$$
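To make the entrywise chain rule concrete, here is a numerical sketch (assuming NumPy, and a hypothetical choice of $f$ extracting the yaw angle $\phi = \operatorname{atan2}(r_{21}, r_{11})$): the partials $\partial\phi/\partial r_{ij}$ are estimated by finite differences, while $\mathrm d\boldsymbol R/\mathrm d\alpha = \boldsymbol S\boldsymbol R$ is used analytically.

```python
import numpy as np

def skew(v):
    # S(v) such that skew(v) @ u == np.cross(v, u)
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def rot(axis, alpha):
    # Rodrigues' formula for a rotation by alpha about a unit axis
    K = skew(axis)
    return np.eye(3) + np.sin(alpha) * K + (1.0 - np.cos(alpha)) * (K @ K)

def phi(R):
    # hypothetical extracted angle: yaw, read off two matrix entries
    return np.arctan2(R[1, 0], R[0, 0])

axis = np.array([1.0, 2.0, 2.0]) / 3.0   # arbitrary unit rotation axis
alpha, h = 0.3, 1e-6

R = rot(axis, alpha)
dR = skew(axis) @ R                      # known derivative dR/dalpha = S(delta) R

# entrywise chain rule: sum_ij (dphi/dr_ij)(dr_ij/dalpha)
total = 0.0
for i in range(3):
    for j in range(3):
        Rp, Rm = R.copy(), R.copy()
        Rp[i, j] += h
        Rm[i, j] -= h
        total += (phi(Rp) - phi(Rm)) / (2 * h) * dR[i, j]

# direct finite difference of phi(R(alpha)) for comparison
direct = (phi(rot(axis, alpha + h)) - phi(rot(axis, alpha - h))) / (2 * h)
print(total, direct)  # the two values agree
```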

In any case, I still wonder whether a reasonable chain rule for derivatives involving matrices exists, because this solution does not let me exploit the information I already have about the derivative of the matrix.


There are 2 answers below.

BEST ANSWER

For typing convenience, let $$G=\frac{\partial\phi}{\partial R}$$ We also know that the differential of $R$ in terms of $\alpha$ is $$dR = SR\,d\alpha$$ Use these two pieces of information to find the differential of $\phi$ and then its gradient $$\eqalign{ d\phi &= G:dR \cr &= G:SR\,d\alpha \cr \frac{d\phi}{d\alpha} &= G:SR \cr }$$ where a colon is used to denote the trace/Frobenius product, i.e. $\,\,\,A\!:\!B={\rm tr}(A^TB)$
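The Frobenius product here is just the sum of entrywise products, so $G:SR$ coincides with the double sum from the question. A quick check of the identity $A\!:\!B={\rm tr}(A^TB)=\sum_{ij}A_{ij}B_{ij}$ on random matrices (assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

frob_trace = np.trace(A.T @ B)  # A : B = tr(A^T B)
frob_sum = np.sum(A * B)        # sum of entrywise products
print(np.isclose(frob_trace, frob_sum))  # True
```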

ANSWER

"I still wonder whether a reasonable chain rule for derivatives involving matrices could exist, because this solution does not allow to exploit the information I already have about the derivative of the matrix."

If such a thing exists, then I don't know about it. However, by identifying $\mathbb{R}^{n\times n}$ with $\mathbb{R}^{n^{2}}$ as in the comments, you can still preserve the information that you have about the matrix and its derivative.

To see this, suppose $\det(\boldsymbol{R}) = 1$. This gives you a relationship between the entries of $\boldsymbol{R}$; in the two-dimensional case, if $$ \boldsymbol{R} = \begin{pmatrix} a & b \\ c & d \\ \end{pmatrix}, $$ then $$ \det(\boldsymbol{R}) = 1 \iff ad - bc = 1. $$ By rewriting $\boldsymbol{R}$ as the vector $(a, b, c, d) \in \mathbb{R}^{4}$, you still have a relationship between the variables, much like the coordinates of the $2$-sphere $S^{2}$ are typically $(x, y, z) \in \mathbb{R}^{3}$ with the relationship $x^{2} + y^{2} + z^{2} = 1$.

When you compute the derivative, these relationships allow you to rewrite some derivatives in terms of the others, which is how you exploit them. So, if each entry of $\boldsymbol{R}$ is a function of $t$, then $$ \frac{\mathrm{d}\boldsymbol{R}}{\mathrm{d}t} = \frac{\partial\boldsymbol{R}}{\partial a}\frac{\mathrm{d}a}{\mathrm{d}t} + \frac{\partial\boldsymbol{R}}{\partial b}\frac{\mathrm{d}b}{\mathrm{d}t} + \frac{\partial\boldsymbol{R}}{\partial c}\frac{\mathrm{d}c}{\mathrm{d}t} + \frac{\partial\boldsymbol{R}}{\partial d}\frac{\mathrm{d}d}{\mathrm{d}t}. $$ But $ad - bc = 1$ implies that $d = (1 + bc)/a$, so $$ \frac{\mathrm{d}d}{\mathrm{d}t} = \frac{\partial d}{\partial a}\frac{\mathrm{d}a}{\mathrm{d}t} + \frac{\partial d}{\partial b}\frac{\mathrm{d}b}{\mathrm{d}t} + \frac{\partial d}{\partial c}\frac{\mathrm{d}c}{\mathrm{d}t}. $$ I admit this can get messy, but working in $\mathbb{R}^{n^{2}}$ is analogous to working in $\mathbb{R}^{n\times n}$, and you can still exploit the relationships in this form.
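A quick numerical illustration of this substitution (standard library only; the entry functions $a(t)$, $b(t)$, $c(t)$ below are arbitrary hypothetical choices, with $d$ forced by the constraint $ad - bc = 1$, i.e. $d = (1+bc)/a$):

```python
import math

# arbitrary smooth entries (hypothetical), with d determined by ad - bc = 1
a = lambda t: 1.0 + 0.5 * math.sin(t)
b = lambda t: 0.3 * t
c = lambda t: math.cos(t)
d = lambda t: (1.0 + b(t) * c(t)) / a(t)

t0, h = 0.4, 1e-6

# finite-difference rates of the independent entries
da = (a(t0 + h) - a(t0 - h)) / (2 * h)
db = (b(t0 + h) - b(t0 - h)) / (2 * h)
dc = (c(t0 + h) - c(t0 - h)) / (2 * h)

# chain rule for d = (1 + bc)/a:
#   dd/dt = (-(1+bc)/a^2) da/dt + (c/a) db/dt + (b/a) dc/dt
dd_chain = (-(1.0 + b(t0) * c(t0)) / a(t0) ** 2) * da \
           + (c(t0) / a(t0)) * db + (b(t0) / a(t0)) * dc
dd_direct = (d(t0 + h) - d(t0 - h)) / (2 * h)
print(abs(dd_chain - dd_direct))  # small
```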

I would also like to note that I've never done this before, so I do not claim that this is the best method at all. It's just what I would do.