I have a class of probability densities indexed by $3$D rotation matrices. I am working on an estimation problem, so I want to find the Fisher information, which requires taking derivatives of the densities with respect to the rotation matrices.
Let $Q\in \mathbb{R}^{3\times 3}$ be a $3$D rotation matrix, and let $f(Q) = Y_1(Q)\cdots Y_n(Q)\, x \in \mathbb{R}^{n}$ be a function that maps a rotation matrix to a vector in $\mathbb{R}^n$, where the $Y_i(Q)$ are matrices that depend on $Q$ and $x$ is a constant vector. What is the derivative of $f$ with respect to $Q$,
$$\frac{\partial f}{\partial Q}?$$
Thank you!
In general, if you have a function $F:V\to W$ between two normed vector spaces (say over the field $\Bbb{R}$), then given a point $\alpha\in V$, the derivative $DF_{\alpha}$ (also commonly denoted $dF_{\alpha}$) is by definition a linear transformation $V\to W$; i.e., $DF_{\alpha} \in \mathcal{L}(V,W)$, so you can evaluate it on a vector $v\in V$ to get $DF_{\alpha}(v) \in W$.
With this introduction out of the way, all the familiar rules of differential calculus (e.g. the chain rule, the product rule, etc.) work as usual. In your case, $V= M_{3\times 3}(\Bbb{R})$ (the space in which the rotation matrices $Q$ live) and $W = \Bbb{R}^n$, and so the product rule (differentiate each factor in turn and leave all the others untouched, in the same order) implies that \begin{align} Df &= \sum_{k=1}^n Y_1\cdots Y_{k-1} \cdot(DY_k)\cdot Y_{k+1} \cdots Y_n \cdot x. \end{align} This is a slightly condensed notation, in which we do not say where things are being evaluated. If we want to be slightly more explicit, then for all $Q,\xi\in V$ we have \begin{align} Df_Q[\xi] &= \sum_{k=1}^n Y_1(Q)\cdots Y_{k-1}(Q) \cdot(DY_k)_Q[\xi]\cdot Y_{k+1}(Q) \cdots Y_n(Q) \cdot x \quad \in W. \end{align} Here $(DY_k)_Q[\xi]$ is the derivative of $Y_k$ at $Q$ applied to the direction $\xi$, which is itself a matrix of the same size as $Y_k(Q)$. Without further information about the $Y_k$, this is about as simple as it can get.
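If it helps to see the product rule in action, here is a minimal numerical sketch in Python/NumPy. Everything specific in it is an assumption made for illustration only: the factors are taken to be $Y_k(Q) = A_k Q B_k$ with random constant matrices $A_k, B_k$ (so that $(DY_k)_Q[\xi] = A_k \xi B_k$), the number of factors is called `m` to keep it separate from the dimension `n`, and the names `make_factors`, `f`, and `df` are hypothetical. The sketch compares the product-rule formula with a central finite difference of $f$ in the direction $\xi$.

```python
import numpy as np

def make_factors(n, m, seed=0):
    """Hypothetical factors Y_k(Q) = A_k @ Q @ B_k, each linear in Q (an assumption)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n, 3))  # m matrices of shape (n, 3)
    B = rng.standard_normal((m, 3, n))  # m matrices of shape (3, n)
    return A, B

def f(Q, A, B, x):
    """f(Q) = Y_1(Q) ... Y_m(Q) x, applying the rightmost factor first."""
    v = x
    for Ak, Bk in zip(A[::-1], B[::-1]):
        v = Ak @ Q @ Bk @ v
    return v

def df(Q, xi, A, B, x):
    """Product rule: in the k-th term, Y_k(Q) is replaced by DY_k[xi] = A_k @ xi @ B_k."""
    m = len(A)
    total = np.zeros_like(x)
    for k in range(m):
        v = x
        for j in range(m - 1, -1, -1):   # rightmost factor first
            M = xi if j == k else Q      # differentiate only the k-th factor
            v = A[j] @ M @ B[j] @ v
        total += v
    return total

n, m = 4, 3
A, B = make_factors(n, m)
x = np.ones(n)

# A rotation about the z-axis, and an arbitrary direction xi in M_{3x3}(R).
c, s = np.cos(0.3), np.sin(0.3)
Q = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
xi = np.random.default_rng(1).standard_normal((3, 3))

h = 1e-5
fd = (f(Q + h * xi, A, B, x) - f(Q - h * xi, A, B, x)) / (2 * h)
exact = df(Q, xi, A, B, x)
print("max abs difference:", np.max(np.abs(fd - exact)))
```

The two results should agree up to finite-difference error. For your actual $Y_k$ you would replace the line `M = xi if j == k else Q` and the surrounding product with whatever $Y_k(Q)$ and $(DY_k)_Q[\xi]$ really are.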
This is already a lot of information, because $Df_Q\in \mathcal{L}(V,W) = \mathcal{L}(M_{3\times 3}(\Bbb{R}), \Bbb{R}^n)$ is a linear transformation from a $9$-dimensional space to an $n$-dimensional one. If you try to represent this linear transformation as a matrix, it gets ugly very quickly (any matrix representation of this linear map has size $n\times 9$), so unless you really need to write it in matrix form, I would avoid it.
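If you do need a matrix, one way to get it is to evaluate $Df_Q$ on the standard basis matrices $E_{ij}$ of $M_{3\times 3}(\Bbb{R})$ and stack the results as columns; since $Q$ is $3\times 3$ in the question, there are $9$ columns. The sketch below does this for a hypothetical two-factor example with $Y_k(Q) = A_k Q B_k$ (the matrices `A1, A2, B1, B2`, the vector `x`, and the function names are all assumptions, not part of the question), using row-major ordering of the entries of $\xi$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A1, A2 = rng.standard_normal((2, n, 3))  # hypothetical constant matrices
B1, B2 = rng.standard_normal((2, 3, n))
x = rng.standard_normal(n)

def df(Q, xi):
    """Df_Q[xi] for f(Q) = (A1 Q B1)(A2 Q B2) x, via the product rule."""
    first = A1 @ xi @ B1 @ (A2 @ Q @ B2 @ x)   # differentiate the first factor
    second = A1 @ Q @ B1 @ (A2 @ xi @ B2 @ x)  # differentiate the second factor
    return first + second

def jacobian_matrix(Q):
    """Matrix of Df_Q with respect to the basis E_ij, columns in row-major order."""
    cols = []
    for i in range(3):
        for j in range(3):
            E = np.zeros((3, 3))
            E[i, j] = 1.0
            cols.append(df(Q, E))
    return np.stack(cols, axis=1)  # shape (n, 9)

# A rotation about the x-axis as the base point.
c, s = np.cos(0.7), np.sin(0.7)
Q = np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

J = jacobian_matrix(Q)
xi = rng.standard_normal((3, 3))
# By linearity, J applied to the flattened xi reproduces Df_Q[xi].
print(J.shape, np.max(np.abs(J @ xi.reshape(-1) - df(Q, xi))))
```

The point of the design is that the matrix is never written down by hand: you only ever need the map $\xi \mapsto Df_Q[\xi]$, and the matrix is just a bookkeeping device built from it.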