Motivation: I wish to build up linear connections on manifolds using parallel propagators, as opposed to covariant derivatives or horizontal bundles. I do not have a textbook that follows this approach entirely and is also modern enough, so most of what follows is self-built. I have run into some problems and uncertainties, though, which I haven't been able to resolve.
Definitions: Let $(M,\tau,\mathcal{A})$ be a smooth $n$-dimensional manifold. To every smooth curve $\gamma:[t_0,t_1]\rightarrow M$, let us associate a linear map $P_\gamma(t_0,t_1):T_{\gamma(t_0)}M\rightarrow T_{\gamma(t_1)}M$, called the parallel propagator, satisfying the following properties:
- $P_\gamma(t_0,t_0)=\text{Id}$,
- $P_\gamma(t',t_1)P_\gamma(t_0,t')=P_\gamma(t_0,t_1)$ for any $t_0\le t'\le t_1$,
- $P_\gamma(t_0,t_1)^{-1}=P_\gamma(t_1,t_0)$,
- $P$ depends on $\gamma$, $t_0$ and $t_1$ in a smooth manner.
This last property is hard to state rigorously, so I assume the following definition of smoothness will suffice: $P$ is smooth in $\gamma,t_0$ and $t_1$ if for any chart $(U,\psi)$, and any smooth curve $\gamma$ for which $\gamma(t_0)$ and $\gamma(t_1)$ are contained in $U$, the matrix $P_\gamma(t_0,t_1)^\mu_{\ \ \nu}$ depends smoothly on $\gamma^\mu(t)$, $t_0$ and $t_1$. Here the matrix is that of $P_\gamma(t_0,t_1)$ with respect to the bases $\partial_\mu|_{\gamma(t_0)}$ in the starting space and $\partial_\mu|_{\gamma(t_1)}$ in the target space.
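The algebraic properties above can at least be sanity-checked numerically in a concrete case. The following is a minimal sketch under assumptions that are not part of my setup: it takes the unit 2-sphere with coordinates $(\theta,\varphi)$, obtains the propagator from the Levi-Civita Christoffel symbols by integrating $\dot P=-AP$ with $A^\mu_{\ \ \nu}=\Gamma^\mu_{\sigma\nu}\dot\gamma^\sigma$, and uses a test curve I made up. All function names (`christoffel`, `propagator`, ...) are hypothetical helpers. It goes in the opposite direction to what I want (connection first, propagator second), so it only shows what the axioms look like in a known example.

```python
import math

def christoffel(theta):
    """Return G with G[sigma][mu][nu] = Gamma^mu_{sigma nu} on the unit sphere."""
    s, c = math.sin(theta), math.cos(theta)
    g_theta = [[0.0, 0.0], [0.0, c / s]]      # Gamma^mu_{theta nu}
    g_phi = [[0.0, -s * c], [c / s, 0.0]]     # Gamma^mu_{phi nu}
    return [g_theta, g_phi]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def axpy(p, k, c):
    """Return p + c*k for 2x2 matrices."""
    return [[p[i][j] + c * k[i][j] for j in range(2)] for i in range(2)]

def curve(t):
    """Made-up test curve: returns ((theta, phi), (dtheta/dt, dphi/dt))."""
    return (1.0 + 0.3 * t, 0.5 * t + 0.2 * t * t), (0.3, 0.5 + 0.4 * t)

def rhs(t, p):
    """Right-hand side of dP/dt = -A(t) P along the test curve."""
    (theta, _), vel = curve(t)
    g = christoffel(theta)
    minus_a = [[-(g[0][i][j] * vel[0] + g[1][i][j] * vel[1]) for j in range(2)]
               for i in range(2)]
    return matmul(minus_a, p)

def propagator(t0, t1, n=400):
    """Matrix of P_gamma(t0, t1) in the coordinate bases, via classical RK4."""
    h = (t1 - t0) / n
    p, t = [[1.0, 0.0], [0.0, 1.0]], t0
    for _ in range(n):
        k1 = rhs(t, p)
        k2 = rhs(t + h / 2, axpy(p, k1, h / 2))
        k3 = rhs(t + h / 2, axpy(p, k2, h / 2))
        k4 = rhs(t + h, axpy(p, k3, h))
        for k, w in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
            p = axpy(p, k, w * h / 6)
        t += h
    return p

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
t0, tm, t1 = 0.0, 0.4, 1.0
err_identity = max_diff(propagator(t0, t0), IDENTITY)
err_compose = max_diff(matmul(propagator(tm, t1), propagator(t0, tm)),
                       propagator(t0, t1))
err_inverse = max_diff(matmul(propagator(t1, t0), propagator(t0, t1)), IDENTITY)
print(err_identity, err_compose, err_inverse)
```

The three printed numbers are the deviations from the identity, composition and inverse properties; they should be zero up to the integration error of the RK4 scheme.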
The covariant derivative of a smooth vector field $Y$ defined around $p$ along $X=\gamma'(0)$ ($\gamma(0)=p$) is defined as $$ \nabla_XY|_p=\left.\frac{d}{dt}P_\gamma(t,0)Y_{\gamma(t)}\right|_{t=0}. $$
Problem 1: The first problem starts here. The formula for the covariant derivative involves a one-parameter family of tangent vectors at $p$, so the derivative makes sense. The usual behaviour of $\nabla_XY$ in $Y$ can also be read off from this formula. What I cannot see from it is the behaviour in the argument $X$, namely that the result depends only on the value of $X$ at $p$ and not on the rest of the curve $\gamma$.
To solve this problem, I thought about looking at the coordinate expression: $$ (\nabla_XY)^\mu=\left.\frac{d}{dt}\left(P_\gamma(t,0)^\mu_{\ \ \nu}Y^\nu(\gamma(t))\right)\right|_{t=0}=X^\sigma\partial_\sigma Y^\mu+Y^\nu\left.\frac{d}{dt}P_\gamma(t,0)^\mu_{\ \ \nu}\right|_{t=0}. $$ The trouble is that I don't understand how (the matrix of) $P$ depends on the curve. It is clear that $P_\gamma$ depends entirely on the curve, but it is probably not a "functional" of the curve in the usual sense, since it also depends explicitly on the curve's parameter. Suppose I accept that I can use the chain rule as $$ \frac{d}{dt}P_\gamma(t,0)^\mu_{\ \ \nu}=\partial_\sigma P_\gamma(t,0)^\mu_{\ \ \nu}\frac{d\gamma^\sigma}{dt}=X^\sigma\partial_\sigma P_\gamma(t,0)^\mu_{\ \ \nu}, $$ where all derivatives are taken at $t=0$, and then name $\partial_\sigma P_\gamma(t,0)^\mu_{\ \ \nu}$ as $\Gamma^\mu_{\sigma\nu}$. Then I get back the usual coordinate formula for the covariant derivative, and also that $\nabla_XY$ depends only on $X$ and not on $\gamma$. However, this use of the chain rule would imply that $P$ is essentially a function on the manifold with $P_\gamma=P\circ\gamma$, and this doesn't seem right. It does give me a correct result, though.
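To see what the claimed ultralocality looks like in practice, here is a minimal numerical sketch, again under assumptions of my own choosing: the unit 2-sphere with coordinates $(\theta,\varphi)$ and the propagator generated by its Levi-Civita Christoffel symbols. Two made-up curves, `curve_a` and `curve_b`, pass through the same point with the same velocity $X$ but different accelerations, and $\frac{d}{dt}P_\gamma(t,0)^\mu_{\ \ \nu}|_{t=0}$ is estimated by a central difference. All helper names are hypothetical. This only illustrates the property in an example where a connection is already given; it does not show that it follows from the four properties I listed.

```python
import math

def christoffel(theta):
    """Return G with G[sigma][mu][nu] = Gamma^mu_{sigma nu} on the unit sphere."""
    s, c = math.sin(theta), math.cos(theta)
    return [[[0.0, 0.0], [0.0, c / s]], [[0.0, -s * c], [c / s, 0.0]]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def axpy(p, k, c):
    """Return p + c*k for 2x2 matrices."""
    return [[p[i][j] + c * k[i][j] for j in range(2)] for i in range(2)]

def curve_a(t):
    """Made-up curve: ((theta, phi), velocity), with zero coordinate acceleration."""
    return (1.0 + 0.3 * t, 0.5 * t), (0.3, 0.5)

def curve_b(t):
    """Same point and velocity at t = 0 as curve_a, different acceleration."""
    return ((1.0 + 0.3 * t + 2.0 * t * t, 0.5 * t - 3.0 * t * t),
            (0.3 + 4.0 * t, 0.5 - 6.0 * t))

def propagator(curve, t0, t1, n=50):
    """Matrix of P_gamma(t0, t1): RK4 for dP/dt = -A(t) P."""
    def rhs(t, p):
        (theta, _), vel = curve(t)
        g = christoffel(theta)
        minus_a = [[-(g[0][i][j] * vel[0] + g[1][i][j] * vel[1])
                    for j in range(2)] for i in range(2)]
        return matmul(minus_a, p)

    h = (t1 - t0) / n
    p, t = [[1.0, 0.0], [0.0, 1.0]], t0
    for _ in range(n):
        k1 = rhs(t, p)
        k2 = rhs(t + h / 2, axpy(p, k1, h / 2))
        k3 = rhs(t + h / 2, axpy(p, k2, h / 2))
        k4 = rhs(t + h, axpy(p, k3, h))
        for k, w in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
            p = axpy(p, k, w * h / 6)
        t += h
    return p

def derivative_at_zero(curve, h=1e-3):
    """Central-difference estimate of d/dt P_gamma(t, 0) at t = 0."""
    plus, minus = propagator(curve, h, 0.0), propagator(curve, -h, 0.0)
    return [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

g0 = christoffel(1.0)
# Gamma^mu_{sigma nu} X^sigma at p = (1.0, 0.0) with X = (0.3, 0.5)
expected = [[g0[0][i][j] * 0.3 + g0[1][i][j] * 0.5 for j in range(2)]
            for i in range(2)]
d_a, d_b = derivative_at_zero(curve_a), derivative_at_zero(curve_b)
print(max_diff(d_a, expected), max_diff(d_b, expected), max_diff(d_a, d_b))
```

In this example both curves should give the same matrix, equal to $\Gamma^\mu_{\sigma\nu}X^\sigma$ up to the finite-difference error, even though the curves differ away from $t=0$.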
Question 1: How can I see that the covariant derivative defined this way is ultralocal in $X$? If what I did is correct, why is it correct? I am sure the parallel propagator is not even a "proper" two-point tensor on the manifold, can I just use the chain rule this way? If so, why?
Problem 2: If $X$ is a smooth vector field on $M$, let $P_X(p,t)$ be the parallel propagator $P_{\phi^X_{(\cdot)}(p)}(0,t)$, where $\phi^X_t(p)$ is the flow of $X$ (from point $p$, for time $t$).
The covariant derivative at $p$ can then be expressed as $$ \nabla_XY|_p=\left.\frac{d}{dt}[P_X(p,t)^{-1}Y_{\phi^X_t(p)}]\right|_{t=0}=\left.\frac{d}{dt}[P_X(\phi^X_t(p),-t)Y_{\phi^X_t(p)}]\right|_{t=0} .$$
For smooth vector fields $X,Y$ for which $[X,Y]=0$, the curvature tensor at $p$ can supposedly be expressed as $$ R(X,Y)Z=\left.\frac{\partial^2}{\partial t\,\partial s}\left(P_X(p,t)^{-1}P_Y(\phi^X_t(p),s)^{-1}P_X(\phi^Y_s(p),t)P_Y(p,s)Z\right)\right|_{t=s=0}, $$ where $Z$ is a tangent vector at $p$, though I guess it can, if needed, also be a vector field defined around $p$.
The arguments of the propagators have been simplified using the commutation of the flows. The loop goes as follows: $p\rightarrow \phi^Y_s(p)\rightarrow\phi^X_t(\phi^Y_s(p))\rightarrow\phi^Y_{-s}(\phi^X_t(\phi^Y_s(p)))=\phi^X_t(p)\rightarrow\phi^X_{-t}(\phi^Y_{-s}(\phi^X_t(\phi^Y_s(p))))=p$.
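As a numerical sanity check of the loop expression (not a derivation), here is a minimal sketch under assumptions I picked for convenience: the unit 2-sphere with coordinates $(\theta,\varphi)$, its Levi-Civita connection, and the commuting coordinate fields $X=\partial_\theta$, $Y=\partial_\varphi$, whose flows are just coordinate shifts. The inverses are computed with the identity $P_X(q,t)^{-1}=P_X(\phi^X_t(q),-t)$ used above, and the mixed derivative is a central finite difference. The helper names `gamma`, `transport`, `loop` and `mixed_derivative` are hypothetical.

```python
import math

def gamma(direction, theta):
    """Matrix Gamma^mu_{dir nu} on the unit sphere (dir 0 = theta, 1 = phi)."""
    s, c = math.sin(theta), math.cos(theta)
    if direction == 0:
        return [[0.0, 0.0], [0.0, c / s]]
    return [[0.0, -s * c], [c / s, 0.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def axpy(p, k, c):
    """Return p + c*k for 2x2 matrices."""
    return [[p[i][j] + c * k[i][j] for j in range(2)] for i in range(2)]

def transport(direction, start, param, n=20):
    """Propagator along the flow of d/dtheta (0) or d/dphi (1),
    starting at `start` and running for (possibly negative) time `param`."""
    theta0 = start[0]

    def rhs(tau, p):
        # Only theta matters: the Christoffel symbols do not depend on phi.
        theta = theta0 + tau if direction == 0 else theta0
        return [[-x for x in row] for row in matmul(gamma(direction, theta), p)]

    h = param / n
    p, tau = [[1.0, 0.0], [0.0, 1.0]], 0.0
    for _ in range(n):
        k1 = rhs(tau, p)
        k2 = rhs(tau + h / 2, axpy(p, k1, h / 2))
        k3 = rhs(tau + h / 2, axpy(p, k2, h / 2))
        k4 = rhs(tau + h, axpy(p, k3, h))
        for k, w in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
            p = axpy(p, k, w * h / 6)
        tau += h
    return p

def loop(p, t, s):
    """P_X(p,t)^{-1} P_Y(phi^X_t p, s)^{-1} P_X(phi^Y_s p, t) P_Y(p, s)."""
    theta, phi = p
    d = transport(1, (theta, phi), s)            # P_Y(p, s)
    c = transport(0, (theta, phi + s), t)        # P_X(phi^Y_s p, t)
    b = transport(1, (theta + t, phi + s), -s)   # P_Y(phi^X_t p, s)^{-1}
    a = transport(0, (theta + t, phi), -t)       # P_X(p, t)^{-1}
    return matmul(a, matmul(b, matmul(c, d)))

def mixed_derivative(p, h=1e-3):
    """Central-difference estimate of d^2/(dt ds) of the loop at t = s = 0."""
    hpp, hpm = loop(p, h, h), loop(p, h, -h)
    hmp, hmm = loop(p, -h, h), loop(p, -h, -h)
    return [[(hpp[i][j] - hpm[i][j] - hmp[i][j] + hmm[i][j]) / (4 * h * h)
             for j in range(2)] for i in range(2)]

p0 = (1.0, 0.3)
numeric = mixed_derivative(p0)
# Components R^mu_{nu theta phi} of the unit sphere in the convention
# R(X,Y)Z = nabla_X nabla_Y Z - nabla_Y nabla_X Z - nabla_[X,Y] Z.
expected = [[0.0, math.sin(p0[0]) ** 2], [-1.0, 0.0]]
print(numeric)
print(expected)
```

For this example the two printed matrices should agree up to the finite-difference error, that is, the loop expression reproduces $R^\theta_{\ \ \varphi\theta\varphi}=\sin^2\theta$ and $R^\varphi_{\ \ \theta\theta\varphi}=-1$. Of course, this says nothing yet about how to do the computation analytically or for a general propagator.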
Question 2: How can I evaluate this expression to get the usual formula for the curvature tensor? I could write up some of my attempts here, but I think it would be pointless, because they amount to incoherent brainstorming. I do not properly understand the functional dependencies of $P$, including the dependence on the parameters $s$ and $t$, and this leaves me completely stuck.
How can noncommuting vector fields be incorporated into this?
Basically, the reason I want this is that the clearest geometric meaning of the curvature tensor is that it expresses the path-dependence of parallel transport, infinitesimally. I would like to obtain this from actual, finite parallel transport as opposed to just postulating $R(X,Y)Z=\nabla_X\nabla_YZ-\nabla_Y\nabla_XZ-\nabla_{[X,Y]}Z$.
I realize this latter question may be way too long and detailed to be answered here, however any mention of textbooks that deal with the curvature tensor by parallel propagators (including proofs) would also be greatly appreciated!