Deriving an evaluation metric for how out of order an ordering is compared to ground truth?

Question

Deriving an evaluation metric for how out of order an ordering is compared to ground truth?

304 Views Asked by Bumbble Comm At 10 May 2026 - 11:30

Let's say we have an unsorted list and we know what the sorted list looks like (ground truth). What are some metrics we could employ to quantify how badly unsorted our list is?

i.e. How bad is [4, 5, 2, 1, 3, 5] if our desired output is [1, 2, 3, 4, 5, 5]?

It's important to mention I am not evaluating this with sorting time complexity in mind. (I am not interested in how close this list is to the worst case for quicksort for example). I am only interested in comparing an ordering to the ground truth.

One metric could be edit distance, but one shortcoming there would be that [5, 2, 3, 4, 5, 1] would be as good an ordering as [1, 2, 4, 3, 5, 5].

What if maybe we should care about how far an element is from its correct position?

There is no specific use case in mind. The point of the question is learning about different approaches depending on different use cases?

Original Q&A

There are 2 best solutions below

**Bumbble Comm** · Answer 1 · 2020-06-28 16:43:35

The Wasserstein metric $W_1$ does something similar. Intuitively, if you have two differently distributed piles of dirt, it gives you the minimum amount of transported dirt weighted by the transport distance to transform the one pile into the other. So if you have a metric space $(X,d)$ and two integrable functions $f_1,f_2:X\to[0,\infty)$ (these represent the distributions) with $\int_X f_1\mathrm dx=\int_X f_2\mathrm dx$, then $W_1(f_1,f_2)$ is this minimum amount of transport you need to transform one pile into the other. The wikipedia link formalizes this idea, but it's for a pretty general case, which I don't want to discuss here. In your case, one could define some kind of "distinguishing" Wasserstein metric $W_1^d$ in the following way: Conceptualize your list as a function $f:\{1,\dots,n\}\to\mathbb N$. You can also replace $\mathbb N$ with any other set from which your list elements might be. Now define the functions

$f_n:\{1,\dots,n\}\to\{0,1\},~f_n(i)=\begin{cases}1&f(i)=n,\\0&\textrm{otherwise}\end{cases}$ for all $n\in\mathbb N$.

For your list $[4,5,2,1,3,5]$ this would translate to the lists

$f_1:\quad [0,0,0,1,0,0]\\ f_2:\quad [0,0,1,0,0,0]\\ f_3:\quad [0,0,0,0,1,0]\\ f_4:\quad [1,0,0,0,0,0]\\ f_5:\quad [0,1,0,0,0,1]$

It's essentially a list that just says "the original list contains/does not contain $n$ at this point". Now consider the sorted list $s$. We can define the Wasserstein metric $W_1(f_n,s_n)$ between $f_n$ and $s_n$ as the minimum distance the 1's in the list have to be transported in total to get from the list $f_n$ to the list $s_n$. For $n=2$, this would be $W_d(f_2,s_2)=1$, since the 1 has to be transported one place to the left. This corresponds to the fact that in the original list $f$, the number 2 is displaced by a distance of 1 from its sorted place. For $n=5$, it would be $W_d(f_5,s_5)=3$. You could either transport the first 5 to the second to last place (so a distance of 3), or you could transport the second 5 to the second to last place and the first 5 to the last place, which would involve a total movement of 5. The Wasserstein metric only cares about the shortest distance, though, so it's 3.

With all of this, we can define the distinguishing Wasserstein metric between $f$ and $s$ to be

$W_1^d(f,s)=\sum_{n\in\mathbb N}W_1(f_n,s_n)$.

This is the minimum total transport distance to transform the unsorted list to the sorted one. In your example, we would get $W_1^d(f,s)=W_1(f_1,s_1)+W_1(f_2,s_2)+W_1(f_3,s_3)+W_1(f_4,s_4)+W_1(f_5,s_5)=3+1+2+3+3=12$. It also addresses the shortcoming of edit distance. Take $g,h$ to be your lists which were supposed to illustrate this shortcoming, where $g$ has 1 and 5 swapped, while $h$ has 3 and 4 swapped. Then $W_1^d(g,s)=10>2=W_1^d(h,s)$, which is what you wanted.

**Bumbble Comm** · Answer 2 · 2020-06-28 16:58:22

Suppose you have lists of $n$ distinct elements. Fixing a 'base' ordering, there is a natural 1-1 correspondence between permutations on $n$ elements and orderings. In particular, the identity permutation corresponds to your 'true' sorted list. Thus, any metric on the group of permutations over a set of $n$ elements yields such a criterion for you: $d(\sigma, \textrm{id})$. Some examples are:

Spearman's Footrule: $$ d_S(\sigma, \pi) = \sum_{i=1}^n |\sigma(i) - \pi(i)| $$
Spearman's Rank Correlation: $$ d_{SR}(\sigma, \pi) = \sum_{i=1}^N \big(\sigma(i) - \pi(i)\big)^2 $$
Kendall's Tau: $$ d_{K}(\sigma, \pi) = \textrm{Min number of pairwise adjacent transpositions needed to take $\pi^{-1}$ to $\sigma^{-1}$} $$
Cayley Distance: $$ d_{C}(\sigma, \pi) = \textrm{Min number of transpositions taking $\sigma$ to $\pi$} $$

For many others see p. 112 of this paper.

Deriving an evaluation metric for how out of order an ordering is compared to ground truth?

There are 2 best solutions below

Related Questions in SEQUENCES-AND-SERIES

Related Questions in SORTING

Trending Questions

Popular # Hahtags

Popular Questions