In *Riemannian Geometry* (pp. 49-50), do Carmo gives the following definition. Let $\mathcal{X}(M)$ denote the set of all vector fields of class $C^{\infty}$ on $M$, and let $\mathcal{D}(M)$ denote the ring of all real-valued functions of class $C^{\infty}$ defined on $M$. An affine connection $\nabla$ on a differentiable manifold $M$ is a mapping $\nabla : \mathcal{X}(M) \times \mathcal{X}(M) \rightarrow \mathcal{X}(M)$ which is denoted by $(X,Y) \xrightarrow{\nabla} \nabla_{X}Y$ and which satisfies the following properties:
- $\nabla_{fX+gY}Z = f\nabla_{X} Z+ g\nabla_{Y}Z$
- $\nabla_{X}(Y+Z) = \nabla_{X}Y + \nabla_{X}Z$
- $\nabla_{X}(fY) = f\nabla_{X}Y+ X(f)Y$
in which $X,Y,Z \in \mathcal{X}(M)$ and $f,g \in \mathcal{D}(M)$.
The first property simply says that $\nabla$ is linear in the first argument, right? (There, the first argument happens to be $fX + gY$.) So why does the second argument behave differently? By "argument" I mean that if I write the covariant derivative as $\nabla (X,Y)$, I can clearly see the linearity stated in property 1. Property 2 resembles property 1 in that it gives additivity in the second argument. But multiplication by a function is different: instead of yielding just $\nabla_{X}(fY) = f\nabla_{X}Y$, property 3 yields $\nabla_{X}(fY) = f\nabla_{X}Y+ X(f)Y$.
My Question:
Why don't we write $\nabla_{X}(fY) = f\nabla_{X}Y$? Why do we specify the product rule? Why is this important for affine connections?
Remark:
Consider the first property. It gives $\nabla_{fX}Z=f\nabla_X Z$, which doesn't involve the product rule, does it? So why do we need the product rule when we compute $\nabla_{X}(fZ)$?
$\newcommand{\Reals}{\mathbf{R}}\newcommand{\vec}{\mathbf{e}}\newcommand{\dd}{\partial}\newcommand{\Del}{\nabla}$Let's consider the special case $M = \Reals$, on which there's a non-vanishing vector field (i.e., a global frame) $\vec = \dd/\dd x$.
Let $\phi$ and $\psi$ be smooth functions, and $$ X = \phi \vec = \phi\, \frac{\dd}{\dd x},\qquad Y = \psi \vec = \psi\, \frac{\dd}{\dd x} $$ the associated vector fields.
Thinking of $\vec$ as a constant vector field (its single component, $1$, is the same at every point), we have $\Del_{\vec} \vec = 0$. By ordinary calculus, $$ \Del_{X}Y = \phi \frac{\dd}{\dd x}\biggl(\psi \frac{\dd}{\dd x}\biggr) = \phi \psi'\, \frac{\dd}{\dd x}. \tag{1} $$
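As a quick sanity check of (1), here is a worked instance. The specific choices $\phi(x) = x$ and $\psi(x) = x^{2}$ are mine, made purely for illustration:

$$ X = x\, \frac{\dd}{\dd x},\quad Y = x^{2}\, \frac{\dd}{\dd x}: \qquad \Del_{X}Y = x \cdot (x^{2})'\, \frac{\dd}{\dd x} = 2x^{2}\, \frac{\dd}{\dd x},\qquad \Del_{Y}X = x^{2} \cdot (x)'\, \frac{\dd}{\dd x} = x^{2}\, \frac{\dd}{\dd x}. $$

Swapping the two slots changes the answer, which already suggests that the two arguments play different roles.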
The central point is that $\phi$, the component of $X$ (with respect to the frame $\vec$), appears in the right-hand expression via its value, while $\psi$, the component of $Y$, appears via the value of its derivative.
To highlight the distinction more clearly, let $f$ and $g$ be smooth functions. By (1), $$ \Del_{fX}(gY) = (f\phi) (g\psi)'\, \frac{\dd}{\dd x}. \tag{2} $$

The factor $f$ in (2) is merely multiplied, and therefore factors out: $$ \Del_{fX}(gY) = (f\phi) (g\psi)'\, \frac{\dd}{\dd x} = f\biggl(\phi (g\psi)'\, \frac{\dd}{\dd x}\biggr) = f \Del_{X}(gY). $$

By contrast, the factor $g$ in (2) is inside a derivative, and consequently entails application of the product rule: \begin{align*} \Del_{fX}(gY) &= (f\phi) (g\psi)'\, \frac{\dd}{\dd x} \\ &= (f\phi) (g\psi' + g'\psi)\, \frac{\dd}{\dd x} \\ &= g(f\phi) \psi'\, \frac{\dd}{\dd x} + \bigl[(f\phi) g'\bigr] \psi\, \frac{\dd}{\dd x} \\ &= g\Del_{fX} Y + \bigl[(fX)g\bigr] Y. \end{align*}
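To see the extra term concretely, take the illustrative choices (mine, not Do Carmo's) $\phi = 1$, $\psi(x) = x$, $f(x) = x$, $g(x) = x^{2}$, so that $fX = x\, \dd/\dd x$ and $gY = x^{3}\, \dd/\dd x$. Then

\begin{align*} \Del_{fX}(gY) &= x \cdot (x^{3})'\, \frac{\dd}{\dd x} = 3x^{3}\, \frac{\dd}{\dd x}, \\ g\Del_{fX} Y &= x^{2} \cdot \bigl(x \cdot (x)'\bigr)\, \frac{\dd}{\dd x} = x^{3}\, \frac{\dd}{\dd x}, \\ \bigl[(fX)g\bigr] Y &= \bigl(x \cdot (x^{2})'\bigr) \cdot x\, \frac{\dd}{\dd x} = 2x^{3}\, \frac{\dd}{\dd x}. \end{align*}

The last two lines add up to the first. The naive rule $\Del_{fX}(gY) = g\Del_{fX}Y$ would give $x^{3}\, \dd/\dd x$ and miss the term $2x^{3}\, \dd/\dd x$ coming from the derivative of $g$.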
Conceptually, the point is that the vector fields $X$ and $Y$ in (1) do not play symmetric, interchangeable roles:
- The field $X$ acts as a differential operator. The value of $\Del_{X} Y$ at some point $p$ depends only on the value $X(p)$.
- The field $Y$ is the object being differentiated. The value of $\Del_{X} Y$ at some point $p$ depends on derivatives of the components of $Y$, i.e., on the values of $Y$ in a neighborhood of $p$.
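A small example at $p = 0$ may make this concrete; the particular fields below are assumptions chosen for illustration. The fields $Y_{1} = \vec$ and $Y_{2} = (1 + x)\vec$ have the same value at $0$, but by (1) their derivatives along $\vec$ differ there:

$$ \Del_{\vec} Y_{1} = 0,\qquad \Del_{\vec} Y_{2} = (1 + x)'\, \vec = \vec. $$

On the other hand, the fields $X_{1} = \vec$ and $X_{2} = (1 + x)\vec$ also agree at $0$, and for any $Y = \psi \vec$ they give the same derivative at that point:

$$ (\Del_{X_{1}} Y)(0) = \psi'(0)\, \vec,\qquad (\Del_{X_{2}} Y)(0) = (1 + 0)\, \psi'(0)\, \vec = \psi'(0)\, \vec. $$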
Similar comments hold (for completely analogous reasons) on arbitrary manifolds, hence Do Carmo's axioms for an affine connection.
In case this distinction helps: The expression $\Del_{X} Y$ is bilinear over scalars: If $a$ and $b$ are real numbers, then $$ \Del_{aX} (bY) = (ab) \Del_{X} Y. $$ However, $\Del_{X} Y$ is not bilinear over functions; it is linear over functions only in $X$. In the notation above, $$ \Del_{fX} Y = f \Del_{X}Y,\qquad \Del_{X} (gY) = (\Del_{X} g)Y + g \Del_{X} Y = (Xg)Y + g \Del_{X} Y. $$
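One way to see what would go wrong without the product rule, at least in the flat example above: suppose, hypothetically, that $\Del_{X}(gY) = g\Del_{X}Y$ held for every smooth $g$. Since $\Del_{X}\vec = \phi \Del_{\vec}\vec = 0$, we would get

$$ \Del_{X} Y = \Del_{X}(\psi \vec) = \psi\, \Del_{X} \vec = 0 $$

for all $X$ and $Y$, in contradiction with (1), which gives $\phi\psi'\, \dd/\dd x$. In this example, then, an operation that is function-linear in both slots cannot detect how $Y$ varies, so it does not differentiate anything.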