I'm following the tutorial at this link, where the author states:
These follow from the various ways one can iterate covariant derivatives:
$$\nabla^3_{xyz}s = \nabla^2_{xy}(\nabla_zs) - \nabla_{\nabla^2_{xy}z}s$$ and $$\nabla^3_{xyz}s = (\nabla_x \nabla^2)_{yz}s + \nabla^2_{yz}(\nabla_{x}s)$$
I'm unable to derive these equations myself.
My Attempt
For the second covariant derivative, I seem to be able to derive this. I used the product rule and the fact that we can commute the covariant derivative with contractions to show that:
$$\nabla_x\nabla_y s= \nabla_x C(\nabla s \otimes y) $$ $$\nabla_x\nabla_y s= C(\nabla_x \nabla s \otimes y) + C(\nabla s \otimes \nabla_x y ) $$ $$\nabla_x\nabla_y s= (\nabla_x\nabla s)(y) + \nabla_{\nabla_x y} s $$
I then assume that $(\nabla_x\nabla s)(y)$ is the second covariant derivative, so:
$$ \nabla^2_{xy}s = \nabla_x\nabla_y s - \nabla_{\nabla_x y } s $$
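As a sanity check on that assumption, the right-hand side is tensorial in $y$, which is what one expects of $(\nabla_x\nabla s)(y)$. This is only a sketch: $f$ is an arbitrary smooth function introduced for this check, and $\partial_x f$ denotes its derivative in the direction $x$.
\begin{align*} \nabla_x\nabla_{fy}s - \nabla_{\nabla_x(fy)}s &= \nabla_x(f\,\nabla_y s) - \nabla_{(\partial_x f)y + f\nabla_x y}s\\ &= (\partial_x f)\nabla_y s + f\,\nabla_x\nabla_y s - (\partial_x f)\nabla_y s - f\,\nabla_{\nabla_x y}s\\ &= f\left(\nabla_x\nabla_y s - \nabla_{\nabla_x y}s\right) \end{align*}
The $(\partial_x f)$ terms cancel, so the expression depends only on the value of $y$ at the point, not on its derivatives.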
Now at this point we have one expression for the second covariant derivative. To get the next one, I just used the product rule to get:
$$ \nabla_x\nabla_y s = (\nabla_x\nabla)_y s + \nabla_{\nabla_xy}s + \nabla_y(\nabla_x s) $$ $$ \nabla_x\nabla_y s - \nabla_{\nabla_xy}s = (\nabla_x\nabla)_y s + \nabla_y(\nabla_x s) $$
$$ \nabla^2_{xy}s = (\nabla_x\nabla)_y s + \nabla_y(\nabla_x s) $$
But any attempts to do the same thing for the third covariant derivative seem to be failing for me. Is there some straightforward way to get to the results I quoted above from here that I'm not seeing?
My recommendation is to work out your own definitions and notation for everything. When you read someone else's writing, use their notation and proof as a guide to how to write everything including the proof in your own notation. Don't worry about understanding their notation literally.
I find higher covariant derivatives to be very confusing. The way I deal with it is to view the covariant derivative of a higher-order covariant derivative as just a special case of the covariant derivative of a tensor. For example, the covariant derivative of a second-order tensor $T$ is defined to be $$ (\nabla T)(X,Y,Z) = \partial_X(T(Y,Z)) - T(\nabla_XY,Z) - T(Y,\nabla_XZ) $$ So the second-order covariant derivative of $T$ is \begin{align*} (\nabla^2T)(X,Y,Z,W) &= (\nabla(\nabla T))(X,Y,Z,W)\\ &=\partial_X(\nabla T(Y,Z,W)) - \nabla T(\nabla_XY,Z,W)\\ &\quad - \nabla T(Y,\nabla_XZ,W) - \nabla T(Y,Z,\nabla_XW) \end{align*} And so on.
Therefore, $\nabla^3T = \nabla(\nabla(\nabla T)) = \nabla^2(\nabla T) = \nabla(\nabla^2T)$. Now skew-symmetrization and the Ricci identity should give you what you want.
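To make the iteration concrete for a section $s$, here is a sketch in the question's notation. It assumes the convention $\nabla^2_{xy}s = \nabla_x\nabla_y s - \nabla_{\nabla_x y}s$ and $\nabla^3_{xyz}s = (\nabla(\nabla^2 s))(x,y,z)$, with the first slot as the direction of differentiation. Reading $\nabla^3 s$ as $\nabla(\nabla^2 s)$ and using the tensor rule above gives
\begin{align*} \nabla^3_{xyz}s &= \nabla_x\left(\nabla^2_{yz}s\right) - \nabla^2_{\nabla_x y,\,z}s - \nabla^2_{y,\,\nabla_x z}s \end{align*}
Reading it instead as $\nabla^2(\nabla s)$, write $\nabla_z s = C(\nabla s \otimes z)$ and apply the product rule twice inside $\nabla^2_{xy} = \nabla_x\nabla_y - \nabla_{\nabla_x y}$:
\begin{align*} \nabla^2_{xy}(\nabla_z s) &= \nabla^3_{xyz}s + \nabla^2_{x,\,\nabla_y z}s + \nabla^2_{y,\,\nabla_x z}s + \nabla_{\nabla^2_{xy}z}s \end{align*}
Expanding every term into iterated first derivatives shows that both sides reduce to $\nabla_x\nabla_y\nabla_z s - \nabla_{\nabla_x y}\nabla_z s$. Compared with the first quoted formula, the two middle terms are extra; they vanish at a point where $z$ is chosen with $\nabla z = 0$. Whether the tutorial intends that choice is an assumption on my part, not something I have checked against its text.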
Note that my personal convention is to never write $\nabla_XT(Y,Z)$. I find that notation difficult to work with, even though I like the way the chain rule identity looks using that notation: $$ \partial_X(T(Y,Z)) = \nabla_XT(Y,Z) + T(\nabla_XY,Z) + T(Y,\nabla_XZ) $$