I'm reading the book "The Ricci Flow: An Introduction" and I'm at the part where the authors prove the Bianchi identities using the diffeomorphism invariance of the curvature. I'm stuck on some computations from the paragraph below:
Consider the scalar curvature operator $g \mapsto R_g$ and its linearization $DR_g$ defined by
$$D R_{g}(h)=-g^{i j} g^{k \ell}\left(\nabla_{i} \nabla_{j} h_{k \ell}-\nabla_{i} \nabla_{k} h_{j \ell}+R_{i k} h_{j \ell}\right) \ \ \ \ \ (1)$$
for any $2$-tensor $h$. Substituting
$$h_{i j}=\left(\mathcal{L}_{X} g\right)_{i j}=\nabla_{i} X_{j}+\nabla_{j} X_{i}$$
(where $X$ is an arbitrary vector field) and commuting covariant derivatives yields
$$\begin{align} D R_{g}\left(\mathcal{L}_{X} g\right) &=-2 \Delta \nabla_{i} X^{i}-2 R_{i j} \nabla^{i} X^{j}+\nabla^{i} \nabla_{j} \nabla_{i} X^{j}+\nabla_{i} \nabla_{j} \nabla^{j} X^{i} \ \ (2)\\ &=2 X^{i} \nabla^{j} R_{i j} \ \ (3) \end{align}$$
I don't understand how to go from $(1)$ to $(2)$. Commuting derivatives, we get (where I'm using the obvious notation $\nabla_{j, k} = \nabla_j \nabla_k$):
$$\nabla_{j, k} X_{\ell} - \nabla_{k, j} X_{\ell} = R_{jks}^{\ell} X^{s}$$
and with some work we can substitute $h = \mathcal{L}_{X} g$ into $(1)$ and obtain:
$$DR_g(h) = -g^{ij}g^{kl}\left( \nabla_{i} \left(R_{jks}^{\ell} X^{s} \right)- \nabla_{i}\left(R_{j{\ell}s}^{k} X^{s}\right) +R_{i k} h_{j \ell}\right) $$
but I still can't get from here to $(2)$. Nor can I see how $(3)$ follows from $(2)$. I've been stuck at this for a while now and would really appreciate some help.
You don't have to commute derivatives to go from (1) to (2); just plug in $h_{ij} = \nabla_i X_j + \nabla_j X_i$ directly. That is, \begin{align*} g^{ij}g^{kl}\nabla_i\nabla_j h_{kl} & = \Delta \left(g^{kl} h_{kl}\right) = 2\Delta\nabla_i X^i, \\ g^{ij}g^{kl}\nabla_i\nabla_k h_{jl} & = \nabla^i\nabla^k h_{ik} = \nabla^i\nabla^k\nabla_i X_k + \nabla^i\nabla^k\nabla_k X_i, \\ g^{ij}g^{kl} R_{ik}h_{jl} & = R^{ik}h_{ik} = 2R^{ik}\nabla_iX_k . \end{align*} The first line uses $\nabla g = 0$ to move $g^{kl}$ inside the derivatives, and the factors of $2$ in the first and third lines come from the symmetry of $g^{kl}$ and $R^{ik}$, which makes the two terms of $h_{ij}$ contribute equally.
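To make the assembly explicit, here is a sketch of the remaining bookkeeping: insert the three contractions into (1), keep its overall minus sign, and then raise, lower, and relabel the contracted indices. The particular relabeling is my own choice, made only to match the form of (2). \begin{align*} DR_g(\mathcal{L}_X g) &= -\left(2\Delta\nabla_i X^i - \nabla^i\nabla^k\nabla_i X_k - \nabla^i\nabla^k\nabla_k X_i + 2R^{ik}\nabla_i X_k\right) \\ &= -2\Delta\nabla_i X^i - 2R_{ij}\nabla^i X^j + \nabla^i\nabla_j\nabla_i X^j + \nabla_i\nabla_j\nabla^j X^i , \end{align*} which is (2).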
To go from (2) to (3), first commute the two outer covariant derivatives. The curvature terms that appear are Ricci contractions, and they cancel by the symmetry of $R_{ij}$: $$ \nabla_i\nabla_j\nabla^j X^i = \nabla_j\nabla_i\nabla^j X^i - R_{ij}\nabla^j X^i + R_{ij}\nabla^i X^j = \nabla_j\nabla_i\nabla^j X^i . $$ Next commute the two inner derivatives, which produces one more Ricci term: $$ \nabla_j\nabla_i\nabla^j X^i = \Delta\nabla_i X^i + \nabla^j(R_{ij}X^i) . $$ After relabeling indices, the first identity shows that the last two terms of (2) are equal, and the second identity then rewrites them: \begin{align*} DR_g(\mathcal{L}_Xg) & = -2\Delta\nabla_iX^i + 2\nabla^i\nabla_j\nabla_i X^j - 2R_{ij}\nabla^iX^j \\ & = 2\nabla^j(R_{ij}X^i) - 2R_{ij}\nabla^i X^j \\ & = 2X^i\nabla^jR_{ij} , \end{align*} where the last equality follows from the product rule and the symmetry of $R_{ij}$. This gives (3).
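For reference, the second displayed identity above is an instance of the contracted Ricci identity, and the final step is a one-line product-rule computation. A sketch, assuming the sign convention in which the Ricci tensor of the round sphere is positive (which I believe is the book's convention, but check it against the book's appendix): \begin{align*} \nabla_i\nabla_j X^i - \nabla_j\nabla_i X^i &= R_{jk}X^k , \\ \nabla_j\left(\nabla_i\nabla^j X^i\right) &= \nabla_j\left(\nabla^j\nabla_i X^i + R^{j}{}_{k}X^k\right) = \Delta\nabla_i X^i + \nabla^j\left(R_{jk}X^k\right) , \\ 2\nabla^j\left(R_{ij}X^i\right) - 2R_{ij}\nabla^i X^j &= 2X^i\nabla^j R_{ij} + 2R_{ij}\nabla^j X^i - 2R_{ij}\nabla^i X^j = 2X^i\nabla^j R_{ij} . \end{align*} The cancellation in the last line uses only $R_{ij} = R_{ji}$.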