How should I measure the "fit" between two sequences?

93 Views Asked by Bumbble Comm At 27 Mar 2026 - 12:10

This question is for a machine learning application, which, surprisingly, is not stock price prediction. I apologize for being overbroad, but any ideas or even help framing the question would be greatly appreciated.

My application generates sequences of numbers that I later want to compare against a sequence of actual measurements. Given a set of generated sequences, I want a measure that lets me rank them by how closely each one matches the actual sequence.

I have these intended uses for this data:

1) to select the "best" generated sequence from a set, to present to the client

2) to provide feedback to the machine learning algorithm

3) to communicate the quality of the algorithm's choices to the client

For example:

actual: [1000, 1200, 700, 900, 2100, 2500, 2700]

generated: [1000, 1500, 600, 700, 1200, 1100, 700]

generated: [1000, 2000, 2100, 3300, 4200, 6000, 6100]

generated: [1000, 1100, 1000, 900, 1100, 1200, 1400]

I can calculate the percent error between elements of the two sequences, and average that over the length of the sequence. (But that number can range widely -- it can get really big if the generated sequence is way off.)
I can also calculate what percent of the generated sequences fall inside some threshold for error percentage (e.g. error margin of +/- 30%). That gives me an overall measure of the quality of generated sequences.
I can measure the overall difference between the two sequences as the square root of the sum of the squared differences between corresponding elements, i.e. the Euclidean distance. (That value can range widely, too.)
I have a "quality" measure that sums the logs of the absolute differences and raises a constant to that sum; i.e. 0.99 ^ z where z = the sum of log(abs(a - g)) for each a=actual and g=generated value. This is nice because it falls off as the differences grow, and it stays between 0 and 1 whenever z is non-negative (for example, when every absolute difference is at least 1), which makes it easier to feed back into the machine learning algorithm. (An exact match gives log(0), which is undefined, so that case needs special handling.)
In searching before I asked this, I saw another answer that brought up the Minkowski distance.
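
The four measures described above can be sketched in a few lines of Python, using the example sequences from the question. This is only an illustration under stated assumptions: the function names are hypothetical, the 30% threshold is applied to each sequence's mean percent error (one possible reading of the description), and the quality score uses log(1 + |a - g|) rather than log(|a - g|) so that an exact match does not hit log(0).

```python
import math

actual = [1000, 1200, 700, 900, 2100, 2500, 2700]
generated = [
    [1000, 1500, 600, 700, 1200, 1100, 700],
    [1000, 2000, 2100, 3300, 4200, 6000, 6100],
    [1000, 1100, 1000, 900, 1100, 1200, 1400],
]


def mean_percent_error(a_seq, g_seq):
    """Mean of |a - g| / |a| over the sequence, as a fraction (0.25 == 25%)."""
    return sum(abs(a - g) / abs(a) for a, g in zip(a_seq, g_seq)) / len(a_seq)


def fraction_within(a_seq, candidates, threshold=0.30):
    """Share of candidates whose mean percent error is within the threshold."""
    hits = sum(mean_percent_error(a_seq, g) <= threshold for g in candidates)
    return hits / len(candidates)


def euclidean_distance(a_seq, g_seq):
    """Square root of the sum of squared element-wise differences."""
    return math.sqrt(sum((a - g) ** 2 for a, g in zip(a_seq, g_seq)))


def quality(a_seq, g_seq, base=0.99):
    """base ** z, where z sums log(1 + |a - g|).

    Assumption: log1p replaces the plain log from the question so that
    identical elements contribute 0 instead of an undefined log(0).
    With 0 < base < 1 the result always lies in (0, 1].
    """
    z = sum(math.log1p(abs(a - g)) for a, g in zip(a_seq, g_seq))
    return base ** z


# Rank the candidates from best to worst by mean percent error.
ranking = sorted(range(len(generated)),
                 key=lambda i: mean_percent_error(actual, generated[i]))

for i in ranking:
    g = generated[i]
    print(i,
          round(mean_percent_error(actual, g), 3),
          round(euclidean_distance(actual, g), 1),
          round(quality(actual, g), 3))
print("fraction within 30%:", fraction_within(actual, generated))
```

On the example data, the mean percent error and the Euclidean distance agree on the ordering: the third generated sequence is closest to the actual one and the second is farthest. The two unbounded measures are easy to explain to a client, while the bounded score is the more convenient one to feed back into a learning algorithm.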

https://math.stackexchange.com/questions/426398/how-to-determine-distance-between-two-sequences-e-g-1-2-3-and-1-1-3

http://en.wikipedia.org/wiki/Minkowski_distance
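
For reference, the Minkowski distance of order p between two equal-length sequences is the p-th root of the sum of |a - g|^p. A minimal sketch follows; the function name and the choice to reject p < 1 are assumptions made for illustration, not something taken from the linked answer.

```python
import math


def minkowski_distance(a_seq, g_seq, p=2.0):
    """Minkowski distance of order p between two equal-length sequences."""
    if len(a_seq) != len(g_seq):
        raise ValueError("sequences must have the same length")
    if p < 1:
        raise ValueError("p must be >= 1 for this to be a proper distance")
    return sum(abs(a - g) ** p for a, g in zip(a_seq, g_seq)) ** (1.0 / p)


actual = [1000, 1200, 700, 900, 2100, 2500, 2700]
candidate = [1000, 1100, 1000, 900, 1100, 1200, 1400]

print(minkowski_distance(actual, candidate, p=1))  # sum of absolute differences
print(minkowski_distance(actual, candidate, p=2))  # Euclidean distance
```

With p = 1 this is the sum of absolute differences, and with p = 2 it is the Euclidean distance, so it generalizes the square-root-of-squares measure mentioned in the question. Larger values of p put more weight on the biggest single miss.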

If anyone has some good ideas, or can even help me to frame the question better, it would be a huge help.
