I've got an app in which the user taps a key to the beat of a piece of music to mark out measures. If a song has a tempo of 120 BPM, for example (500 ms/beat), the human-entered values might look like this:
$$u = [507, 989, 1549, 2005, 2525, 2952, 3420, 3978]$$
I'm trying to normalize these values to the closest equal interval to the real beat, but of course we don't know the real beat. I could average the intervals, but I can't assume the error is evenly distributed on either side of the perfect value (people are more likely to be a little late). So my idea is to find the interval $T$ that minimizes the total difference between the ideal beat times and the entered values, like so:
$$\sum_{i=1}^{N} |u_i - T \cdot i|$$
Is this the right approach? And if so, is there a better way than starting with the average and adjusting it bit by bit until I reach the minimum?
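For what it's worth, this objective can be minimized exactly rather than by incremental search: each term can be rewritten as $i \, |u_i/i - T|$, so the optimal $T$ is the weighted median of the ratios $u_i/i$ with weights $i$. A minimal sketch in Python (the function name is mine):

```python
def best_interval(u):
    """Return the T minimizing sum_i |u_i - T*i| for taps u_1..u_N.

    Each term equals i * |u_i/i - T|, so the minimizer is the
    weighted median of the ratios u_i/i with weights i.
    """
    ratios = sorted((u_i / i, i) for i, u_i in enumerate(u, start=1))
    half = sum(w for _, w in ratios) / 2
    acc = 0
    for r, w in ratios:
        acc += w
        if acc >= half:  # cumulative weight crosses the midpoint
            return r

u = [507, 989, 1549, 2005, 2525, 2952, 3420, 3978]
print(best_interval(u))  # 497.25
```

Note that because later taps get larger weights, this estimate leans heavily on the last few taps, which may or may not be what you want.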
How about just taking the time from the first tap to the last and dividing by the number of beats less one? You are assuming the error is not too different at the start and the end, and whatever error there is gets divided by the number of beats. In this case we would guess the tempo is $\frac{3978-507}{7} \approx 496$ msec/beat. You don't really care whether the error is equally distributed on each side: if the person is consistently 100 msec late, you will still get the beat exactly right. What you care about is that there is no consistent drift from the first beat to the last. Maybe you should ignore the first couple of beats, because the person could be systematically late until the song gets going.
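In code this estimate is a one-liner; a sketch in Python (variable names are mine):

```python
u = [507, 989, 1549, 2005, 2525, 2952, 3420, 3978]

# Span from first to last tap, divided by the number of intervals.
interval = (u[-1] - u[0]) / (len(u) - 1)  # (3978 - 507) / 7
print(round(interval))  # 496
```

To ignore the first couple of beats as suggested, just slice before computing, e.g. `u[2:]`, and adjust the interval count accordingly.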