Timm Ahrendt, Schnelle Berechnung der Exponentialfunktion auf hohe Genauigkeit, Berlin: Logos 1999, section 3.2.3 "Gekoppelte Newton-Iteration für Quadratwurzeln" has an elegant division-free coupled iteration for the computation of square roots, for which an unpublished manuscript by A. Schönhage is cited as the reference.
Given $r_{approx}$, a low-precision approximation to $a^{-\frac{1}{2}}$, compute $s_{0}=a*r_{approx}$ and $r_{0}=r_{approx}/2$ then iterate
$$s_{i+1} = s_{i} + r_{i} \cdot (a - s_{i} \cdot s_{i}) \\ \space \space \space \space \space \space \space \space r_{i+1} = r_{i} + r_{i} \cdot (1 - r_{i} \cdot 2 \cdot s_{i+1})$$
where the $s_{i}$ are approximations to $\sqrt{a}$ and the $r_{i}$ are approximations to $\frac{1}{2}a^{-\frac{1}{2}}$. The convergence is quadratic, as the method is related to the Newton iteration. For arbitrary-precision computation this coupled iteration has the advantage that to compute an $m$-bit square root, one only needs the reciprocal square root accurate to about $m/2$ bits. When using binary floating-point computation, this scheme allow for the efficient use of fused multiply-add, with the $i$-th step requiring four FMA operations plus a doubling, which is exact.
I have searched the literature for a similar division-free coupled scheme for real-valued cube roots, but have come up empty-handed. I looked for papers addressing cube roots in particular, and also for the more general case of $n$-th root.
Are any division-free coupled iterations for cube roots available in the literature? The convergence should be at least quadratic. Particularly useful would be schemes that lend themselves naturally to the use of fused multiply-add (FMA), for maximum performance on modern hardware platforms.

For third order roots solving $f(x)=x^3-a$ you would have to approximate the inverse of the derivative $f'(x)^{-1}=\frac13x^{-2}$ with the sequence $r_i$.
So in analogy let $r_{approx}$ be a coarse approximation of $a^{-\frac23}$, set $s_0=a·r_{approx}$, $r_0=\frac13·r_{approx}$ and iterate $$ s_{i+1}=s_i+r_i·(a-s_i^3) $$ and compute $r_{i+1}$ as approximate solution to $$1=3r_{i+1}s_{i+1}^2=3(r_i+Δr_i)s_{i+1}^2$$ so that $3r_is_{i+1}^2=1-3Δr_is_{i+1}^2$ where the right side can be approximativly inverted to $$ 1=(1+3Δr_is_{i+1}^2)3r_is_{i+1}^2 $$ so that an approximate solution of quadratic error in similar form to the quadratic situation is \begin{align} r_{i+1}&=r_i(1+3Δr_is_{i+1}^2) \\ &=r_i+r_i·(1-3r_is_{i+1}^2) \end{align}
In general, $$ f'(s_{i+1})^{-1}=\frac{r_i}{1-[1-r_if'(s_{i+1})]} =r_i·(1+[1-r_if'(s_{i+1})]+O([1-r_if'(s_{i+1})]^2) $$ suggests the update of the approximate inverse derivative as $$r_{i+1}=2r_i-r_i^2f'(s_{i+1}).$$