Is there a distance metric for dot product similarity that preserves the ordering of nearest neighbors?

2.1k Views Asked by At

The dot product and cosine similarity measures on vector space are frequently used in machine learning methods. However, efficient data structures and algorithms often require a metric space distance function.

For cosine similarity, the angular distance defined as

$\displaystyle d(x,y)=\frac{\cos^{-1}(s(x,y))}{\pi}$ where $\displaystyle s(x,y)=\frac{x \cdot y}{||x||||y||}$

is a valid distance metric that preserves the ordering of nearest neighbors for any point. e.g for a set of points P and any point $p\in P$, if we order all points in P by its cosine similarity to p in descending order, then we would get the same thing compared to the ascending order of all points in P by its angular distance to p.

Is there a distance metric for dot product that has the same property?

1

There are 1 best solutions below

0
On BEST ANSWER

No, unless you restrict yourself to vectors of particular norm $\|x\|=r$ (in which case the dot product similarity measure is essentially the same as cosine similarity). The problem is that for any two vectors $x,y$ that do not point in opposite directions, one can find a vector $z$ such that both $x\cdot z$ and $y\cdot z$ are very large (indicating large similarity).

To be precise, suppose that there is a metric $d$ of the form $d(x,y) = \varphi(x\cdot y)$ for some decreasing $\varphi$. Given $x,y$ that do not point in opposite directions, let $$z = M\left(\frac{x}{\|x\|} + \frac{y}{\|y\|} \right)$$ where $M$ is to be chosen. Note that $$x\cdot z = M \|x\|\left ( 1 + \frac{x\cdot y}{\|x\|\|y\|}\right)>0$$ which can be arbitrarily large when $M$ is large. And similarly for $x\cdot y$. The metric $d$ would have to satisfy $$ d(x,y) \le d(x,z)+d(y,z) = \varphi(x\cdot z)+\varphi(x\cdot z) $$