There have been at least two discussions in this forum about why a (pseudo)metric is defined on the quotient of a metric space by $$ d([x],[y]) = \inf \{ d(p_{1}, q_{1}) + \cdots + d(p_{n}, q_{n}) \} $$ where $[p_{1}] = [x]$, $[q_{n}] = [y]$ and $[q_{i}] = [p_{i+1}]$ for $i = 1, \ldots, n-1$. I understand that it is defined this way in order to guarantee the triangle inequality. I wasn't able to verify this myself, so some help here would be appreciated.
Nevertheless, the main point of my question is the following: in Anatole Katok's book A First Course in Dynamics, proposition 2.6.7 asserts that the far more natural function $$ d([x],[y]) = \min \{ |b-a| : a \in [x], b \in [y] \} $$ is actually a metric on $\mathbb{R} / \mathbb{Z}$. That left me wondering: is there some general extra condition ensuring that the more natural function, the infimum of distances over all representatives, is a metric on the quotient? What's special about this example that doesn't work in the general case? Reading Katok's proof, my guess is that it lies in the fact that in this case the minimum is attained by some representative, which can then be fixed.
I prefer to think of quotients in terms of pseudo-metrics on the original space. A pseudo-metric is just like a metric except it may be zero for some pairs of distinct elements. Every pseudo-metric $\rho$ induces an equivalence relation, namely $x\sim y$ iff $\rho(x,y)=0$. It also defines a metric on the equivalence classes: $d([x],[y])=\rho(x,y)$, which is easily seen to be independent of the choices of $x,y$. Thus, the process of defining a metric on a quotient amounts to building a pseudo-metric on the original space which induces the given equivalence relation.
Let's look at the general case first: we have a metric space $(X,d)$ and an equivalence relation $\sim$ on this space. We want a pseudo-metric $\rho$ that induces $\sim$. The first thing that comes to mind is $$\tilde \rho(x,y) = \begin{cases} 0\quad & \text{if } x\sim y, \\ d(x,y) \quad & \text{otherwise}\end{cases}\tag{1}$$ By definition, $\tilde \rho$ is symmetric, nonnegative, and induces the correct equivalence relation. The issue is the triangle inequality. If $\tilde \rho$ happens to satisfy it, we are done: $\rho=\tilde \rho$. But there is no reason why it should. So, we use the chain construction: $$\rho(x,y) = \inf \sum_{i=1}^n \tilde \rho(p_{i},p_{i-1}) \tag{2}$$ where the infimum is taken over all chains connecting $x$ to $y$, namely finite sequences of points $p_0,\dots,p_n$ (of any length $n$) such that $p_0=x$ and $p_n=y$. The triangle inequality for $\rho$ follows from the fact that chains can be concatenated: given $x,y,z$, we can build a chain from $x$ to $z$ out of two chains, one from $x$ to $y$ and one from $y$ to $z$, and the sum over the concatenated chain is the sum of the two sums. Taking the infimum over both chains gives $\rho(x,z)\le \rho(x,y)+\rho(y,z)$.
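To make (1) and (2) concrete, here is a minimal sketch on a finite toy example. The four points $0,1,2,3$ on the real line, the relation $1\sim 2$, and the function name `chain_pseudometric` are my own illustrative choices, not part of the construction above. On a finite set the infimum over chains is a shortest-path problem, so a Floyd-Warshall style relaxation computes it.

```python
from itertools import product


def chain_pseudometric(points, d, equivalent):
    """Compute rho from formula (2) on a finite set of points.

    Starts from tilde-rho as in formula (1), then relaxes through
    intermediate points (Floyd-Warshall), which on a finite set gives
    the infimum of tilde-rho sums over all chains.
    """
    # Formula (1): zero on equivalent pairs, the original distance otherwise.
    rho = {
        (x, y): 0.0 if equivalent(x, y) else float(d(x, y))
        for x, y in product(points, repeat=2)
    }
    # product(..., repeat=3) varies its first coordinate slowest,
    # so k is the outer loop, as Floyd-Warshall requires.
    for k, x, y in product(points, repeat=3):
        rho[x, y] = min(rho[x, y], rho[x, k] + rho[k, y])
    return rho


# Toy example (an assumption for illustration): 0, 1, 2, 3 on the line, with 1 ~ 2.
points = [0, 1, 2, 3]
d = lambda x, y: abs(x - y)
equivalent = lambda x, y: x == y or {x, y} == {1, 2}

rho = chain_pseudometric(points, d, equivalent)
print(rho[0, 2], rho[0, 3], rho[1, 3])
```

In this example $\tilde\rho$ itself fails the triangle inequality, since $\tilde\rho(0,2)=2$ while $\tilde\rho(0,1)+\tilde\rho(1,2)=1$; the chain $0,1,2$ repairs this by giving $\rho(0,2)=1$.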
I emphasize that (2) is not special to the quotient construction; it's just a way to force the triangle inequality to hold. It's as standard and natural as forcing nonnegativity by taking the absolute value of a number. First, convince yourself that (2) always ensures the triangle inequality; then consider what it does when applied to $\tilde \rho$ defined by (1). (It produces the general quotient metric in your post.)
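There is a valid concern that $\rho$ defined by (2) may induce a larger equivalence relation than the one we started with. This does happen: for example, if the equivalence classes are the orbits of an irrational rotation of the circle, then every class is dense, so $\rho$ is identically zero.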
Now consider the special case of the quotient by a group $G$ of isometries, where the equivalence classes are the orbits of $G$. I claim that in this case, the infimum in (2) can just as well be taken over chains with $n=2$. First, notice that chains in which three consecutive elements are equivalent ($p_{i-1}\sim p_{i}\sim p_{i+1}$) are not needed: we can drop the middle term $p_i$ from the chain. Let $i$ be the smallest index such that $p_i\sim p_{i-1}$; suppose $i<n$. This means there is an isometry $f\in G$ such that $f(p_i)=p_{i-1}$. The chain $$p_0,\dots, p_{i-1}, f(p_{i+1}) ,\dots, f(p_{n}), p_n$$ has the same sum of $\tilde \rho$ values, because $f$ preserves both $d$ and $\sim$, the last step costs $\tilde\rho(f(p_n),p_n)=0$, and $$\tilde \rho (p_{i-1}, f(p_{i+1}) ) = \tilde \rho (f(p_{i}), f(p_{i+1})) = \tilde \rho(p_i, p_{i+1}) $$ Also, $p_{i-1} \not\sim f(p_{i+1})$, since $p_i\not\sim p_{i+1}$.
Repeating the above, we create a new chain in which $p_i\not\sim p_{i-1}$ for $i<n$. This implies $$\sum_{i=1}^{n-1} \tilde \rho(p_{i},p_{i-1}) = \sum_{i=1}^{n-1} d(p_{i},p_{i-1})\ge d(p_{n-1},p_0)$$ Thus, the sum of $\tilde\rho$ values along the chain $p_0,p_{n-1},p_n$ is no larger than the sum along the chain we started with.
Conclusion: when the equivalence relation comes from a quotient by isometries, the definition of $\rho$ can be simplified to $$\rho(x,y) = \inf \{ d(x,z) : z\sim y\} \tag{3}$$
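As a sanity check (not a proof), the agreement between (2) and (3) can be tested on a small finite example. The setup below is my own assumption for illustration: twelve equally spaced points on a circle with the circular distance, and the group generated by rotation by four steps, so that $a\sim b$ iff $a-b$ is divisible by $4$. The names `rho_chain` and `rho_orbit` are hypothetical.

```python
from itertools import product

N, STEP = 12, 4          # 12 points on a circle; G is generated by rotation by 4
points = list(range(N))


def d(a, b):
    """Circular distance on Z/12; every rotation is an isometry for it."""
    t = (a - b) % N
    return min(t, N - t)


def equivalent(a, b):
    """a ~ b iff they lie in the same orbit of rotation by STEP."""
    return (a - b) % STEP == 0


def rho_chain():
    """Formula (2): infimum over chains, computed as shortest paths over tilde-rho."""
    rho = {
        (x, y): 0 if equivalent(x, y) else d(x, y)
        for x, y in product(points, repeat=2)
    }
    # k (the first coordinate) is the outer loop of the relaxation.
    for k, x, y in product(points, repeat=3):
        rho[x, y] = min(rho[x, y], rho[x, k] + rho[k, y])
    return rho


def rho_orbit(x, y):
    """Formula (3): distance from x to the orbit of y."""
    return min(d(x, z) for z in points if equivalent(z, y))


chain = rho_chain()
mismatches = [
    (x, y) for x, y in product(points, repeat=2) if chain[x, y] != rho_orbit(x, y)
]
print(mismatches)  # empty if (2) and (3) agree on every pair
```

For instance, $d(0,3)=3$ here, but $11\sim 3$ and $d(0,11)=1$, so both formulas give $\rho(0,3)=1$.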
I did not need infimum-attaining representatives to simplify (2) into (3). However, they are useful for showing that $\rho(x,y)>0$ when $x\not\sim y$; that is, that the pseudometric will induce precisely the equivalence relation we want, and not something larger.