Compare time of day using Euclidean distance

55 Views Asked by At

I have $t_1, t_2, ..., t_N$, where each $t \in [0, 1[$ is a time during the day (i.e. 0.01 is right after midnight and 0.99 is right before midnight). I want to compute the distance between these time of day such that I can identify times that are close to each other.

The distance should wrap around midnight such that times right before midnight and right after midnight are close (constraint 1 below). The distance between two points should also be such that a time twice as far away should have twice the distance (constraint 2 below).

The tricky part is that because this time comparison will be part of an existing nearest neighbor search solution, the distance metric has to be Euclidean distance (L2).

To put it more mathematically:

  • $t \in [0, 1[$
  • $f(t)$ maps $t$ to a vector of dimension $f_d$
  • $d(f(t_1), f(t_2))$ is the Euclidean distance between $f(t_1)$ and $f(t_2)$

These are the constraints we're trying to achieve:

  1. $d(f(t_1 + \Delta, 1), f(t_2 + \Delta, 1)) = c$ for all $\Delta \in [0, 1]$, where $c \in \mathbb{R}$ is a constant
  2. $d(f(t), f(t + k \cdot \Delta)) = k \cdot d(f(t), f(t + \Delta))$

These two constraints could probably expressed in a simpler way. If so, feel free to try to reformulate them!

Own attempt

I tried mapping the time onto the unit circle, i.e. $f(t) = (\sin(t), \cos(t))$. This fulfills constraint 1 as the Euclidean distance between two points on the unit circle is the chord length, which only depends on the angle and thereby wraps around the midnight point at 1. However, this approach doesn't fulfill constraint 2 above as a time twice as far away does not have twice the distance, e.g. $t=0.25, \Delta=0.25, k=2$ such that $d(f(0.25), f(0.75) \ne 2 \cdot d(f(0.25), f(0.5))$.

2

There are 2 best solutions below

1
On

This can't be done.

First of all, you don't actually want constraint 2 as stated, because of the wraparound: you want the distance between $0$ and $3\cdot \frac25$ to equal $\frac15$, not $3\cdot\frac25=\frac65$.

But even confining to situations where you want this literal statement, one can derive a contradiction. For example, consider the images $f(0),f(\frac25),f(\frac45),f(\frac65)=f(\frac15)$:

  • The (Euclidean) distance between $f(0)$ and $f(\frac25)$ should equal the distance between $f(\frac25)$ and $f(\frac45)$, and the distance between $f(0)$ and $f(\frac45)$ should equal twice this distance. These constraints force $f(\frac25)$ to be the midpoint of the segment from $f(0)$ to $f(\frac45)$ (by the law of cosines, if you like).
  • Similarly, $f(\frac45)$ is forced to be the midpoint of the segment from $f(\frac25)$ to $f(\frac65)=f(\frac15)$.
  • But again, $f(\frac15)$ is forced to be the midpoint of the segment from $f(0)$ to $f(\frac25)$.

And these three facts are incompatible with one another. (Well, if $f$ is a constant function then I guess it works, but I doubt that's what you want.)

3
On

If we consider $0 \le t_1 \le 1$ and $0 \le t_2 \le 1$, with $0$ and $1$ indicating the same position, then $$d(t_1, t_2) = \min\bigr( \lvert t_2 - t_1 \rvert,~ \lvert t_2 - t_1 + 1 \rvert,~ \lvert t_2 - t_1 - 1 \rvert \bigr), \quad 0 \le d \le \frac{1}{2}$$ is the wraparound Euclidean distance between $t_1$ and $t_2$. Note that this can be calculated using $$\begin{aligned} t_{+} &= 1 + t_2 - t_1 \\ d(t_1, t_2) &= t_{+} - \lfloor t_{+} \rfloor \\ \end{aligned}$$ where $\lfloor~\rfloor$ denotes truncation or rounding down (say, floor()).

Another way to define the same function is $$d(t_1, t_2) = \frac{1}{2\pi}\arccos\biggr(\cos\Bigr(2\pi (t_2 - t_1) \Bigr)\biggr)$$ which has the interesting feature that $$\cos\varphi = \cos\Bigr(2\pi (t_2 - t_1)\Bigr)$$ where $\varphi$ can be considered an analog of the "angle" between two one-dimensional wraparound Euclidean values $t_2$ and $t_1$.