I have the exponential coordinate/angle-axis vector $\mathbf{w} \in \mathbb{R}^3$ which is composed of two other angle-axis vectors $\mathbf{w}_0$ and $\mathbf{w}_1$:
$\mathbf{w} = Log(Exp(\mathbf{w}_0)Exp(\mathbf{w}_1))$,
where $Exp(\mathbf{w}) = \exp([\mathbf{w}]_\times)$,
$ [\mathbf{w}]_\times = \begin{bmatrix} 0 & -w_z & w_y \\ w_z & 0 & -w_x \\ -w_y & w_x & 0 \end{bmatrix}$,
and $Log(Exp(\mathbf{w})) = \mathbf{w}$.
I want to find an expression for the partial derivative $\frac{\partial \mathbf{w}}{\partial \mathbf{w}_0}$ that can be implemented efficiently. From my understanding, when $\mathbf{w}_0$ is small we can make the following approximation:
$Log(Exp(\mathbf{w}_0)Exp(\mathbf{w}_1)) \approx \mathbf{w}_1 + \mathbf{J}_l^{-1}(\mathbf{w}_1)\mathbf{w}_0 \rightarrow \frac{\partial \mathbf{w}}{\partial \mathbf{w}_0} \approx \mathbf{J}_l^{-1}(\mathbf{w}_1)$
where $\mathbf{J}_l$ is the left Jacobian from this paper. But what happens in the case when both $\mathbf{w}_0$ and $\mathbf{w}_1$ are large?
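(For context, the small-$\mathbf{w}_0$ approximation is easy to check numerically. Below is a minimal sketch of my own, not from the paper, that uses SciPy's `Rotation` as `Exp`/`Log` via `from_rotvec`/`as_rotvec`, and the standard closed form for $\mathbf{J}_l^{-1}$.)

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def skew(w):
    """Skew-symmetric matrix [w]_x."""
    wx, wy, wz = w
    return np.array([[0.0, -wz,  wy],
                     [ wz, 0.0, -wx],
                     [-wy,  wx, 0.0]])

def Jl_inv(w):
    """Inverse left Jacobian of SO(3), standard closed form."""
    th = np.linalg.norm(w)
    W = skew(w)
    if th < 1e-8:
        return np.eye(3) - 0.5 * W
    c = 1.0 / th**2 - (1.0 + np.cos(th)) / (2.0 * th * np.sin(th))
    return np.eye(3) - 0.5 * W + c * (W @ W)

def compose(w0, w1):
    """w = Log(Exp(w0) Exp(w1)) via SciPy rotation vectors."""
    return (R.from_rotvec(w0) * R.from_rotvec(w1)).as_rotvec()

w1 = np.array([0.8, -0.3, 0.5])          # not small
w0 = 1e-4 * np.array([0.2, 0.7, -0.4])   # small perturbation
exact = compose(w0, w1)
approx = w1 + Jl_inv(w1) @ w0
print(np.max(np.abs(exact - approx)))    # error is second order in |w0|
```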
I am also looking for an approximation to $\frac{\partial \mathbf{w}}{\partial \mathbf{w}_1}$.
I eventually figured it out. For $\frac{\partial \mathbf{w}}{\partial \mathbf{w}_0}$ we have:
$ \begin{align} Log(Exp(\mathbf{w}_0 + \delta)Exp(\mathbf{w}_1)) &\approx Log(Exp(\mathbf{w}_0)Exp(\mathbf{J}_r(\mathbf{w}_0)\delta)Exp(\mathbf{w}_1)) \\ &= Log(Exp(\mathbf{w}_0)Exp(\mathbf{w}_1)Exp(Ad_{Exp(\mathbf{w}_1)}^{-1}\mathbf{J}_r(\mathbf{w}_0)\delta)) \\ &\approx Log(Exp(\mathbf{w}_0)Exp(\mathbf{w}_1)) + \mathbf{J}_r^{-1}(\mathbf{w})Ad_{Exp(\mathbf{w}_1)}^{-1} \mathbf{J}_r(\mathbf{w}_0)\delta \\ &= \mathbf{w}+\mathbf{J}_r^{-1}(\mathbf{w})Exp(-\mathbf{w}_1)\mathbf{J}_r(\mathbf{w}_0)\delta \end{align} $
So the gradient is $\mathbf{J}_r^{-1}(\mathbf{w})Exp(-\mathbf{w}_1)\mathbf{J}_r(\mathbf{w}_0)$.
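This expression can be verified against finite differences. The sketch below assumes SciPy's `Rotation` as `Exp`/`Log` and the standard closed forms for $\mathbf{J}_r$ and $\mathbf{J}_r^{-1}$; note that $Exp(-\mathbf{w}_1)$ acts on vectors as the rotation matrix $R(\mathbf{w}_1)^\top$:

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def skew(w):
    """Skew-symmetric matrix [w]_x."""
    wx, wy, wz = w
    return np.array([[0.0, -wz,  wy],
                     [ wz, 0.0, -wx],
                     [-wy,  wx, 0.0]])

def Jr(w):
    """Right Jacobian of SO(3), standard closed form."""
    th = np.linalg.norm(w)
    W = skew(w)
    if th < 1e-8:
        return np.eye(3) - 0.5 * W
    return (np.eye(3)
            - (1.0 - np.cos(th)) / th**2 * W
            + (th - np.sin(th)) / th**3 * (W @ W))

def Jr_inv(w):
    """Inverse right Jacobian of SO(3), standard closed form."""
    th = np.linalg.norm(w)
    W = skew(w)
    if th < 1e-8:
        return np.eye(3) + 0.5 * W
    c = 1.0 / th**2 - (1.0 + np.cos(th)) / (2.0 * th * np.sin(th))
    return np.eye(3) + 0.5 * W + c * (W @ W)

def compose(w0, w1):
    """w = Log(Exp(w0) Exp(w1)) via SciPy rotation vectors."""
    return (R.from_rotvec(w0) * R.from_rotvec(w1)).as_rotvec()

# Two non-small rotations (norms well below pi, away from the Log singularity).
w0 = np.array([0.9, -0.4, 1.2])
w1 = np.array([-0.7, 1.1, 0.3])
w = compose(w0, w1)

# Analytic Jacobian: Exp(-w1) is the matrix R(w1)^T.
J_analytic = Jr_inv(w) @ R.from_rotvec(w1).as_matrix().T @ Jr(w0)

# Central finite differences in each coordinate of w0.
eps = 1e-6
J_numeric = np.zeros((3, 3))
for i in range(3):
    d = np.zeros(3)
    d[i] = eps
    J_numeric[:, i] = (compose(w0 + d, w1) - compose(w0 - d, w1)) / (2 * eps)

print(np.max(np.abs(J_analytic - J_numeric)))  # should be tiny
```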
For $\frac{\partial \mathbf{w}}{\partial \mathbf{w}_1}$:
$ \begin{align} Log(Exp(\mathbf{w}_0)Exp(\mathbf{w}_1+\delta)) &\approx Log(Exp(\mathbf{w}_0)Exp(\mathbf{w}_1)Exp(\mathbf{J}_r(\mathbf{w}_1)\delta)) \\ &\approx Log(Exp(\mathbf{w}_0)Exp(\mathbf{w}_1)) + \mathbf{J}_r^{-1}(\mathbf{w})\mathbf{J}_r(\mathbf{w}_1)\delta \\ &= \mathbf{w} + \mathbf{J}_r^{-1}(\mathbf{w})\mathbf{J}_r(\mathbf{w}_1)\delta \end{align} $
So the gradient is $\mathbf{J}_r^{-1}(\mathbf{w})\mathbf{J}_r(\mathbf{w}_1)$.
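This one can be checked the same way. The self-contained sketch below again assumes SciPy's `Rotation` as `Exp`/`Log` and the standard closed forms for $\mathbf{J}_r$ and $\mathbf{J}_r^{-1}$; per the first line of the derivation, the rightmost factor is $\mathbf{J}_r(\mathbf{w}_1)$:

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def skew(w):
    """Skew-symmetric matrix [w]_x."""
    wx, wy, wz = w
    return np.array([[0.0, -wz,  wy],
                     [ wz, 0.0, -wx],
                     [-wy,  wx, 0.0]])

def Jr(w):
    """Right Jacobian of SO(3), standard closed form."""
    th = np.linalg.norm(w)
    W = skew(w)
    if th < 1e-8:
        return np.eye(3) - 0.5 * W
    return (np.eye(3)
            - (1.0 - np.cos(th)) / th**2 * W
            + (th - np.sin(th)) / th**3 * (W @ W))

def Jr_inv(w):
    """Inverse right Jacobian of SO(3), standard closed form."""
    th = np.linalg.norm(w)
    W = skew(w)
    if th < 1e-8:
        return np.eye(3) + 0.5 * W
    c = 1.0 / th**2 - (1.0 + np.cos(th)) / (2.0 * th * np.sin(th))
    return np.eye(3) + 0.5 * W + c * (W @ W)

def compose(w0, w1):
    """w = Log(Exp(w0) Exp(w1)) via SciPy rotation vectors."""
    return (R.from_rotvec(w0) * R.from_rotvec(w1)).as_rotvec()

# Two non-small rotations (norms well below pi, away from the Log singularity).
w0 = np.array([0.9, -0.4, 1.2])
w1 = np.array([-0.7, 1.1, 0.3])
w = compose(w0, w1)

# Analytic Jacobian for the w1 argument.
J_analytic = Jr_inv(w) @ Jr(w1)

# Central finite differences in each coordinate of w1.
eps = 1e-6
J_numeric = np.zeros((3, 3))
for i in range(3):
    d = np.zeros(3)
    d[i] = eps
    J_numeric[:, i] = (compose(w0, w1 + d) - compose(w0, w1 - d)) / (2 * eps)

print(np.max(np.abs(J_analytic - J_numeric)))  # should be tiny
```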