I am trying to get a feeling for how to design metrics. Here I want to ask about designing a metric from Dynamic Time Warping distance (DTW).
Let's say I want to develop a metric for continuous sequences that behaves similarly to DTW. It is well known that DTW is not a metric: it does not satisfy the triangle inequality, and two different sequences can still have a distance of 0 to each other. Which design patterns, recipes, methods, or tricks exist to turn DTW into a metric? How should I approach the issue (without being a genius or waiting for a lucky epiphany)?
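To make the two failures concrete, here is a minimal sketch of textbook DTW with a squared-difference cost. The function name `dtw` and the toy sequences are my own choices for illustration, not taken from any library.

```python
import math

def dtw(a, b):
    """Plain DTW: sum of squared differences along the cheapest warping path."""
    n, m = len(a), len(b)
    # D[i][j] = cost of the best alignment of a[:i] with b[:j]
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = (a[i - 1] - b[j - 1]) ** 2
            D[i][j] = c + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

# Two different sequences at distance 0: the repeated 0 is absorbed by warping.
zero_dist = dtw([0, 1], [0, 0, 1])

# A triple that breaks the triangle inequality.
x, y, z = [0], [1], [1, 1, 1, 1]
direct = dtw(x, z)              # every element of z is matched to the single 0
detour = dtw(x, y) + dtw(y, z)  # going through y is cheaper than going direct
```

In this toy case `direct` is larger than `detour`, which is exactly what a metric forbids.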
One way to approach this would be to make any warping add some distance between the two sequences. Then, even if warping one of the sequences brings the distance between the matched values down to 0, the total distance is still positive because of the warping itself.
That leaves the triangle inequality, which might still not be satisfied. As a reminder: the total DTW distance is the sum of the squared differences between the values of each matched pair. Here I would suggest using the absolute value of the difference for each match instead of the square.
I am not sure whether I got a metric now. Is this approach sensible? Is it a metric? How do I discuss this?
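One way to start the discussion is to code the proposal and probe it on small examples. The sketch below is only one possible reading of the idea: it assumes a fixed additive `penalty` for every non-diagonal (warping) step, and both the function name `penalized_dtw` and the value 0.5 are hypothetical. A check like this can refute the metric property but never prove it.

```python
import math

def penalized_dtw(a, b, penalty=0.5):
    """DTW with absolute-difference cost plus a fixed penalty per warping step."""
    n, m = len(a), len(b)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = abs(a[i - 1] - b[j - 1])
            D[i][j] = c + min(
                D[i - 1][j - 1],        # diagonal step: no warping, no penalty
                D[i - 1][j] + penalty,  # b[j-1] is reused: warping, penalized
                D[i][j - 1] + penalty,  # a[i-1] is reused: warping, penalized
            )
    return D[n][m]

# The pair that plain DTW puts at distance 0 is now separated.
separated = penalized_dtw([0, 1], [0, 0, 1])

# Probe the triangle inequality on one small triple.
x, y, z = [0], [1], [1, 1, 1, 1]
direct = penalized_dtw(x, z)
detour = penalized_dtw(x, y) + penalized_dtw(y, z)
violates_triangle = direct > detour
```

With this particular additive penalty the zero-distance problem goes away, but `violates_triangle` still comes out true for the triple above: the warping penalty is paid on both routes, while the value mismatch is counted four times on the direct route and only once on the detour. Other penalty designs may behave differently.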
Ok, I will bite.
Why make it a metric? There is a strong case for needing non-metric measures in some cases.
One use of the triangle inequality is speeding up search, but DTW search is already VERY fast using lower-bounding search alone (UCR-Suite).
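As an aside, almost everyone uses cDTW (constrained DTW), which becomes a metric in the limit as the warping window shrinks: with a window of zero and equal-length sequences, only the diagonal alignment is left, so it reduces to the Euclidean distance.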
https://www.cs.unm.edu/~mueen/DTW.pdf
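For what it is worth, both points can be sketched in a few lines. This is not the UCR-Suite itself (which layers several bounds, early abandoning, and normalization on top), just a minimal illustration assuming equal-length series, a squared-difference cost, and a Sakoe-Chiba band. The names `cdtw_sq`, `lb_keogh_sq`, and `nearest` are made up for this example.

```python
import math

def cdtw_sq(a, b, window):
    """Squared-cost DTW restricted to the band |i - j| <= window."""
    n = len(a)
    assert n == len(b), "this sketch assumes equal-length series"
    D = [[math.inf] * (n + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - window), min(n, i + window) + 1):
            c = (a[i - 1] - b[j - 1]) ** 2
            D[i][j] = c + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][n]

def lb_keogh_sq(q, c, window):
    """LB_Keogh-style lower bound on cdtw_sq(q, c, window)."""
    total = 0.0
    for i, x in enumerate(c):
        # Envelope of the query inside the warping window around position i.
        seg = q[max(0, i - window):min(len(q), i + window + 1)]
        upper, lower = max(seg), min(seg)
        if x > upper:
            total += (x - upper) ** 2
        elif x < lower:
            total += (lower - x) ** 2
    return total

def nearest(query, candidates, window):
    """1-NN search that skips the full DTW whenever the bound cannot win."""
    best, best_d, skipped = None, math.inf, 0
    for cand in candidates:
        if lb_keogh_sq(query, cand, window) >= best_d:
            skipped += 1  # cheap bound is already no better than the best so far
            continue
        d = cdtw_sq(query, cand, window)
        if d < best_d:
            best, best_d = cand, d
    return best, best_d, skipped

query = [0, 1, 2, 1]
candidates = [[0, 0, 1, 2], [5, 5, 5, 5], [0, 1, 2, 2]]
best, best_d, skipped = nearest(query, candidates, window=1)

# With window 0 only the diagonal path survives, i.e. Euclidean distance.
w0 = math.sqrt(cdtw_sq(query, candidates[0], 0))
euclid = math.dist(query, candidates[0])
```

Here the far-away candidate `[5, 5, 5, 5]` is discarded by the bound without ever running the full DTW, and no triangle inequality is involved.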