On a Riemannian manifold, a geodesic $\gamma_{pq}$ between two points $p$ and $q$ locally minimizes the length between these two points. I would generally expect this to imply that if $\delta \gamma_{pq}$ is some variation of the geodesic which preserves the end points, the first order variation of the length must vanish. However, I have confused myself a bit with what seems to be an obvious counter example: consider $\mathbb{R}$ with the standard Euclidean metric. Then, between two points, there is either the straight line between those points, or a curve which "overshoots" and returns back to the end points. If the overshoot is up to $q + \epsilon$, then the variation of the length is $2\epsilon$, which is first order!
If the back-tracking is a problem, one can do the same thing in two-dimensional Euclidean space with a curve which goes from $p$ to $s$ and then from $s$ to $q$, having $s$ be nearly co-linear with $p$ and $q$ and "behind" either $p$ or $q$.
Is there some reason that these variations of the geodesic are "not valid" from the perspective of the variational calculus? Or is my starting assumption simply incorrect?
This is a great question. You have to set this up more precisely as a variation in the function space. As you suggested, let's work in $\Bbb R$ with $p=0$ and $q=1$. We have the base curve given by $g(t)=t$, $0\le t\le 1$. We want to consider variations $g+\epsilon\eta$ with $\eta(0)=\eta(1)$ and $\eta$ piecewise $C^1$. You can check that the scenario you describe is given by $$g_\epsilon(t)=(g+\epsilon\eta)(t) = \begin{cases} \frac{1+\epsilon}{1-\epsilon/2}t, & 0\le t\le 1-\epsilon/2 \\ -2t+3, & 1-\epsilon/2\le t\le 1\end{cases}.$$ We proceed at constant speed from $0$ to $1+\epsilon$ and then back from $1+\epsilon$ to $1$. This is a perfectly nice piecewise-linear function. But what is $\eta$? Simple algebra gives us $$\eta(t) = \begin{cases} \frac3{2-\epsilon}t, & 0\le t\le 1-\epsilon/2 \\ \frac3\epsilon(1-t), & 1-\epsilon/2 \le t\le 1\end{cases}.$$ And now you see the problem. The variation "vector field" needs to blow up to accommodate your variation. That is, your innocent variation is not at all an $\epsilon$-linear perturbation in the function space.
(Recall that the usual set-up is to consider $L(g)$ and then require that the directional derivatives $D_\eta L(g) = \frac d{d\epsilon}\Big|_{\epsilon=0} L(g+\epsilon\eta)$ vanish for all possible $\eta$ with $\eta(0)=\eta(1)=0$.)