Comparing ranking algorithms


If I have several different ranking algorithms and a 'correct' ranking, is there a good way of "scoring" the alternative rankings given by the algorithms against the reference one?

For example:

  • Real Ranking [1, 2, 3, 4, 5, 6] (for simplicity)
  • Alternative Ranking A [1, 2, 4, 3, 5, 6]
  • Alternative Ranking B [3, 5, 2, 1, 4, 6]

Out of those, I'd like to be able to say that Alternative Ranking A is better than Alternative Ranking B since it's much closer to the real ranking.

Is it perhaps a case of measuring the distance between the location of each item in the alternative and real (expected) rankings? Maybe squared distance or something?

Best answer

Yes, it boils down to defining a distance between [123456] and other permutations. Perhaps the most mathematically natural one is the number of inversions (also known as the Kendall tau distance): pairs of numbers where the larger one appears before the smaller. In your ranking A there is just one inverted pair: [43]. In ranking B there are six: [32], [31], [52], [51], [54], [21].
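To make the inversion count concrete, here is a minimal Python sketch. It assumes each item is labelled by its true rank, so the reference ranking is simply [1, 2, ..., n]; the function name `count_inversions` is my own, not something from a library.

```python
from itertools import combinations

def count_inversions(ranking):
    """Count pairs where a larger item appears before a smaller one.

    Assumes items are labelled by their true rank (reference = 1..n).
    combinations() yields pairs in list order, so `a` always precedes `b`.
    """
    return sum(1 for a, b in combinations(ranking, 2) if a > b)

ranking_a = [1, 2, 4, 3, 5, 6]
ranking_b = [3, 5, 2, 1, 4, 6]

print(count_inversions(ranking_a))  # 1
print(count_inversions(ranking_b))  # 6
```

This checks every pair, which is quadratic in the list length. That is fine for short rankings; a merge-sort based count would be the usual choice for long ones.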

However, other distances may be more appropriate for your use case. For example, should [2,1,3,4,5,6] be considered as good as [1,2,3,4,6,5]? The number of inversions is the same. But, say, in sports, we care more about guessing who will be on the podium (and in what order) than about 5th and 6th place finishes. Also, which of [6,1,2,3,4,5] and [2,3,4,5,6,1] is better?

So you may decide to assign unequal penalty scores to inversions. Perhaps [21] should cost more than [65]. Maybe the penalty can be the reciprocal of the smaller number in the inversion. Or maybe the distance between inverted numbers can factor into the penalty.
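As a rough sketch of one such scheme, the snippet below charges each inversion the reciprocal of the smaller number in the pair. This is only one of the options mentioned above, the function name `weighted_inversions` is hypothetical, and items are again assumed to be labelled by their true rank.

```python
from itertools import combinations

def weighted_inversions(ranking):
    """Sum a penalty of 1 / min(a, b) over every inverted pair (a before b, a > b).

    Assumes items are labelled by their true rank, so mistakes involving
    top-ranked items cost more than mistakes near the bottom.
    """
    return sum(1 / min(a, b) for a, b in combinations(ranking, 2) if a > b)

print(weighted_inversions([2, 1, 3, 4, 5, 6]))  # 1.0 (the [21] inversion)
print(weighted_inversions([1, 2, 3, 4, 6, 5]))  # 0.2 (the [65] inversion)
```

With this particular weighting, [2,3,4,5,6,1] is penalised more heavily than [6,1,2,3,4,5], because all five of its inversions involve the top item. Whether that ordering is the one you want depends on your use case.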

Your idea about taking differences and squaring is not obviously bad, but I would find it difficult to fine-tune such an approach so that the results it produces match one's intuitive notion of quality.
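For comparison, here is a small sketch of the squared-difference idea from the question, under the same assumption that items are labelled by their true rank. The name `squared_position_distance` is made up for illustration.

```python
def squared_position_distance(ranking):
    """Sum of squared gaps between each item's actual and expected position.

    Assumes items are labelled by their true rank, so item k belongs at
    position k (1-based) in the reference ranking.
    """
    return sum((item - pos) ** 2 for pos, item in enumerate(ranking, start=1))

print(squared_position_distance([1, 2, 4, 3, 5, 6]))  # 2
print(squared_position_distance([3, 5, 2, 1, 4, 6]))  # 24
```

It does rank A ahead of B, but it gives [6,1,2,3,4,5] and [2,3,4,5,6,1] the same score (30 each), so on its own it does not settle which of those two is better.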