I'm studying the Jacobi iterative method, which solves a linear system of equations in the form $\mathbf A x = b$. As an example, I considered this simple C implementation, where the programmer seems to have chosen the Euclidean distance between two consecutive results $||x^{(k+1)} - x^{(k)}||_2$ (lines 73-78 of the source code) as a convergence metric.
Although the implementation seems to work with this metric, I wonder why one would choose it over the residual error $||b - \mathbf A x||$ or any other metric, other than being less computationally expensive? Do some metrics provide more "strong" convergence guarantees than others or something like that? Are some of them more suitable for some algorithms (or some types of inputs) more than others? How should the tolerance threshold be chosen w.r.t the chosen metric?
Convergence is guaranteed if your system matrix $A$ is such that $$\rho(I - D^{-1}A) < 1$$ with $D$ being the diagonal matrix constructed from the diagonal elements of $A$. An easier to check criteria is to examine whether $A$ is diagonally dominant. In that case, the method is also guaranteed to converge: $$\lim_{k\to \infty} x^{(k)} = x \text{ such that } Ax = b.$$
Note that the speed of convergence is irrespective of the break-criterion you use - you always perform the same computations and get the same iterates $x^{(k)}$.
$\Vert x^{(k+1)} - x^{(k)} \Vert_2 $ is probably used since it is somewhat a more economic criterion - if your progress is below some certain threshold, stop the computation. The other, $\Vert b - Ax^{(k+1)} \Vert$ is of course more rigorous and allows you to directly evaluate the error.
In my opinion, it is very advisable to normalize errors, i.e., use $\Vert x^{(k+1)} - x^{(k)} \Vert_2 / \Vert x^{(k+1)} \Vert $ or $\Vert x^{(k+1)} - x^{(k)} \Vert_2 / \Vert x^{(k)} \Vert $ to be independent of the actual values of $x^{(k)}$. Similarly, $\Vert b - Ax^{(k+1)} \Vert/ \Vert Ax^{(k+1)} \Vert$ or $\Vert b - Ax^{(k+1)} \Vert/ \Vert b \Vert$ are more general choices, allowing you to pick one threshold $\epsilon$ regardless whether entries of $b$ are of order $10^9$ or $10^{-5}$.