I was having a discussion with friends and at some point we decided to make predictions of a quantity (the number of daily new COVID cases in a specific area). We all made our predictions and then we looked at the real value. Let's say the real value was $15$, Alice predicted $21$, Bob $11$, and the other friends above $21$. We said Bob "won", and then I jokingly said that if you take the relative error then Alice won, because $$\frac{|21-15|}{21} \approx 0.28 < \frac{|11-15|}{11} \approx 0.36.$$ In other words, you take the absolute difference and divide it by the prediction (not the true value).
Edit: As an answer pointed out, and I confirmed, relative error is defined as the absolute error divided by the true value, not the prediction. In this case I do not want to take the relative error, because it produces exactly the same verdicts as the absolute error: relative error is useful for comparing predictions of different targets (i.e., different real values), but here we have a single real value, so in essence it is no different from the absolute error. Let's call my metric (where I divide by the prediction) Thanassis's Metric (TM). Trademarking it would make it TM™ :) Smaller TM means a better prediction (so it's another error metric).
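To make the three metrics concrete, here is a minimal Python sketch (the function names are mine, not standard terminology) scoring Bob's and Alice's predictions:

```python
# A minimal sketch of the three metrics discussed above.

def absolute_error(target, prediction):
    return abs(target - prediction)

def relative_error(target, prediction):
    # standard definition: divide by the target
    return abs(target - prediction) / target

def tm(target, prediction):
    # "Thanassis's Metric": divide by the prediction
    return abs(target - prediction) / prediction

target = 15
for name, p in [("Bob", 11), ("Alice", 21)]:
    print(f"{name:>5}: abs={absolute_error(target, p):.3f}, "
          f"rel={relative_error(target, p):.3f}, TM={tm(target, p):.3f}")

#   Bob: abs=4.000, rel=0.267, TM=0.364
# Alice: abs=6.000, rel=0.400, TM=0.286
```

Since the target is the same for every prediction, the relative-error column is just the absolute-error column divided by $15$, so it can never change the ranking; only TM can flip the winner.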
My friends protested: "You can't do that! This does not make any sense!" Even though I made the argument in jest, I was surprised by the claim that it makes no sense. I tried to argue that when we are making predictions it's fine to use TM; at least, I do it all the time, and it seems intuitive to me. I tried to give some examples, and after a few attempts we settled on this: suppose you see an aerial photo of a crowd of $2000$ people and you are asked to predict how many people are in the photo. A prediction of $100$ is far, far worse to me than a prediction of $4000$, even though the absolute error (and the relative error) is smaller in the first case. When I try to explain the rationale behind it, I end up with the following: when we are making predictions that span several orders of magnitude (and this is often the case with predictions), we are concerned with getting the order of magnitude right. Think about it this way: the person who guessed $100$ in my example could have guessed $100\,000$ in another case (when the target is again $2000$), and we do not capture this kind of error if we just take the absolute difference.
I guess instead of taking TM we could have taken the absolute difference of the logs $$|\log(\text{target}) - \log(\text{prediction})|$$
The log-difference metric is a direct "translation" of my rationale (we are interested in orders of magnitude). Interestingly, the log method does not yield the same verdict as TM on my initial example (target $15$, predictions $11$ and $21$): under the log difference, $11$ is the better prediction. But it does yield the same verdict in the more extreme example. Maybe TM is indeed a bad metric to use, and the difference of logs is the right metric for what I want to achieve.
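A quick sketch comparing TM against the log difference on both examples (again, function names are mine):

```python
from math import log

def tm(target, prediction):
    return abs(target - prediction) / prediction

def log_diff(target, prediction):
    # |log(target) - log(prediction)| = |log(target / prediction)|
    return abs(log(target) - log(prediction))

for target, preds in [(15, [11, 21]), (2000, [100, 4000])]:
    for p in preds:
        print(f"target={target:>4}, prediction={p:>4}: "
              f"TM={tm(target, p):6.3f}, log_diff={log_diff(target, p):.3f}")

# target=  15, prediction=  11: TM= 0.364, log_diff=0.310   <- log prefers 11
# target=  15, prediction=  21: TM= 0.286, log_diff=0.336   <- TM prefers 21
# target=2000, prediction= 100: TM=19.000, log_diff=2.996
# target=2000, prediction=4000: TM= 0.500, log_diff=0.693   <- both prefer 4000
```

One property worth noting: the log difference is symmetric in target and prediction, so overshooting by a factor of $k$ costs the same as undershooting by a factor of $k$, which matches the "orders of magnitude" rationale. TM is not symmetric: undershooting by a factor of $k$ scores $k - 1$, while overshooting by the same factor scores only $1 - 1/k$.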
In any case, these are my questions (all falling under a general question on rating the accuracy of predictions):
- How would you justify/refute the usage of TM for rating predictions the way I described above?
- How would you justify/refute using the difference of the logs for the same purpose?
- Do you know of any real world examples that are using either metric?
Edit 2: I partly answered my own question below, by refuting the TM metric and providing some graphs of the different errors to support taking the "relative difference" as a metric. I would love to see more thoughts on the matter, or examples where different metrics are used.


I have never seen the relative error divided by the prediction rather than the target. The standard definition of relative error divides the absolute error by the target value. If you divide the difference by the prediction instead, you create a bias toward overprediction, which means it's not quite "accuracy" that you are measuring (at least not in the traditional sense). For example, suppose the target is 10 and the predictions are 8 and 12. Both miss by exactly 2, yet 12 is the more accurate prediction under your definition (2/12 ≈ 0.17 versus 2/8 = 0.25), which doesn't make sense.
Also, in your example, Bob predicted 11 and Alice predicted 21, so Bob won in the traditional sense. Notice too that your metric can never penalize an overprediction by more than 1: if I predict 100000000 and the target is 15, my score is (100000000 - 15)/100000000 ≈ 1, no matter how absurd the guess, whereas predicting 1 scores (15 - 1)/1 = 14.
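To put numbers on the bias, a short sketch assuming the TM definition from the question:

```python
def tm(target, prediction):
    # "Thanassis's Metric" from the question: absolute error / prediction
    return abs(target - prediction) / prediction

print(tm(10, 8))            # 0.25   (undershoots by 2)
print(tm(10, 12))           # ~0.167 (overshoots by 2, yet scores better)
print(tm(15, 100_000_000))  # ~1.0   (an absurd overprediction caps out near 1)
print(tm(15, 1))            # 14.0   (underpredictions are penalized without bound)
```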