I want to tell the number which act as the best approximate basic period (or wavelenght as pointed out by Eric) of a list of real numbers: e.g for {14, 21, 35} we should obtain 7 as the basic period, the same as for {17, 24, 38}(same numbers with an offset) or for {14.5, 21.3, 34.4}(same numbers with some error), etc. It is a kind of approximate GCD algorithm which ignores any offset.
The original use case was that of determining the rate of a heartbeat received from a computer, where beats are randomly skipped, based on their timestamps.
I've come to a possible solution which I'm stuck with: the solutions will minimize the variance of the remainder of the division of the values in the list by the chosen factor $T$ (that's the solution), and I have reached the following formula:
- being $v$ the list $\{v_1, v_2, \ldots, v_n\}$,
- $\operatorname{mean}_{\sin}(v, T)$ $= \frac{1}{n}\sum_{i=1}^n\sin(2\pi v_i/T)$
- $\operatorname{mean}_{\cos}(v, T)$ $= \frac{1}{n}\sum_{i=1}^n\cos(2\pi v_i/T)$
- $\operatorname{period}_{appeal}(v, T)$ $= \sqrt{\operatorname{mean}_{\sin}(v, T)^2 + \operatorname{mean}_{\cos}(v, T)^2} = \frac{1}{n}\sqrt{n + 2\sum_{i\lt j}^n{\cos{\frac{2\pi(v_i-v_j)}{T}}}}$ $\Rightarrow v \approx T·x + \frac{T}{2\pi}\operatorname{atan2}(\operatorname{mean}_{\sin}, \operatorname{mean}_{\cos}), x \in \mathbb{Z}$
Note that $\operatorname{period}_{appeal}$, as it is (rather obscurely) only dependent on the differences between the elements in the list, ignores a constant offset, that is,
- $\operatorname{period}_{appeal}(v, T)$ $= \operatorname{period}_{appeal}(w, T)$ where $w_i = v_i + k \ \forall{i}\forall{k}$
which is great for my original use case. However, we can tune the formula to find the approximate GCD instead, which assumes no offset is present,
- $\operatorname{gcd}_{appeal}(v, T)$ = $1 - \frac{1}{2}\sqrt{\operatorname{mean}_{\sin}(v, T)^2 + (\operatorname{mean}_{\cos}(v, T) - 1)^2}$ $\Rightarrow v \approx T·x, x \in \mathbb{Z}$
This is what they look like for $v = [14, 21, 35]$, period in blue, gcd in green
they both offer maximum appeal at 7, 7/2, 7/3,...,7/z, including at the $\infty$ limits (7/0), as $v_i$ are perfect multiples of 7 (and thus of 7/z). However when $v$ contains some noise, $v = [14.5, 21.3, 34.4]$, the functions start to differ
Here the optimum period is ~6.622 ($v \approx 6.622x + 1.3266$) while the optimum GCD is ~7.0178, and the fractions of the optimum value lose appeal in both cases.
And now my real question is,
Is there any chance of finding the solutions for these functions, that is the points with maximum appeal, whithout having to try every possible number?
If we assume that every $v_i$ corresponds to a different multiple of the period then the range of search for the solution can be limited to $(1, min(|v_i - v_j|)]$
Thanks for your answers.
edit:
You can play with the functions in CoCalc using the following code:
def mean_x(factor, values):
return sum([cos(2*pi*v/factor) for v in values])/len(values)
def mean_y(factor, values):
return sum([sin(2*pi*v/factor) for v in values])/len(values)
def calculatePeriodAppeal(factor, values):
mx = mean_x(factor, values)
my = mean_y(factor, values)
appeal = sqrt(mx^2+my^2)
return appeal
def calculateBestLinear(factor, values):
mx = mean_x(factor, values).n()
my = mean_y(factor, values).n()
y0 = factor*atan2(my,mx)/(2*pi).n()
err = 1-sqrt(mx^2+my^2).n()
return [factor*x + y0, err]
def calculateGCDAppeal(factor, values):
mx = mean_x(factor, values)
my = mean_y(factor, values)
appeal = 1 - sqrt((mx-1)^2+my^2)/2
return appeal
You can also find a fast approximate Java implementation in this StackOverflow answer.

