I know that the Angular Distance is a proper metric but I'm struggling to find a reference that states that and proves all the properties for that distance. In the book Mining of Massive Datasets (Page 95) the authors briefly argue that the properties hold for what they call "Cosine Distance", which is just the angle between two vectors. However, I've read that the Angular Distance must be normalised in order to be considered a metric.
- Does the Angular Distance have to be normalised for it to be a metric?
- Can you point me to a reference that proves that the Angular Distance is a proper metric?
Metric
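Let $X$ be an arbitrary set. A function $d: X \times X \rightarrow \mathbb{R}$ is called a metric on $X$ iff the following properties hold for all $x, y, z \in X$:
- Non-negativity: $d(x, y) \geq 0$
- Identity of indiscernibles: $d(x, y) = 0 \Leftrightarrow x = y$
- Symmetry: $d(x, y) = d(y, x)$
- Triangle inequality: $d(x, z) \leq d(x, y) + d(y, z)$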
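A finite sample can refute the metric axioms for a candidate function, though it can never prove them. The following is a minimal sketch of such a check; the helper name `violated_axioms` and the tolerance `tol` are assumptions made for illustration, not part of any library.

```python
from itertools import product

def violated_axioms(d, points, tol=1e-12):
    """Return the names of the metric axioms that d violates on the sample points."""
    violated = set()
    for x, y in product(points, repeat=2):
        # Non-negativity: d(x, y) >= 0
        if d(x, y) < -tol:
            violated.add("non-negativity")
        # Identity of indiscernibles: d(x, y) = 0 exactly when x = y
        if (abs(d(x, y)) <= tol) != (x == y):
            violated.add("identity of indiscernibles")
        # Symmetry: d(x, y) = d(y, x)
        if abs(d(x, y) - d(y, x)) > tol:
            violated.add("symmetry")
    for x, y, z in product(points, repeat=3):
        # Triangle inequality: d(x, z) <= d(x, y) + d(y, z)
        if d(x, z) > d(x, y) + d(y, z) + tol:
            violated.add("triangle inequality")
    return violated

# The absolute difference on the reals passes every check on this sample.
sample = [-2.0, 0.0, 0.5, 3.0]
print(violated_axioms(lambda a, b: abs(a - b), sample))  # prints set()
```

An empty result only means that no counterexample was found among the sampled points.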
Cosine Distance
For non-zero vectors $x, y \in \mathbb{R}^n$ that enclose the angle $\theta$, cosine distance is defined as:
$$d_{cos}(x, y) := 1 - \cos(\theta) = 1 - {x \cdot y \over \|x\| \|y\|} = 1 - \frac{ \sum\limits_{i=1}^{n}{x_i y_i} }{ \sqrt{\sum\limits_{i=1}^{n}{x_i^2}} \sqrt{\sum\limits_{i=1}^{n}{y_i^2}} }$$
Cosine distance is not a metric on $\mathbb{R}^n$, because it ignores the length of the vectors, so the identity of indiscernibles does not hold. For example, in $\mathbb{R}^1$:
$$d(0.5, 1) = 1 - \cos(0°) = 1 - \frac{1}{1} = 0\text{, but } 0.5 \neq 1$$
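The counterexample can be reproduced numerically. This is a minimal sketch in plain Python; the function name `cosine_distance` is chosen here for illustration, and vectors are assumed to be non-zero sequences of floats.

```python
import math

def cosine_distance(x, y):
    """Return 1 - cos(theta) for two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine distance is undefined for the zero vector")
    return 1.0 - dot / (norm_x * norm_y)

# The 1-D case from the text: 0.5 and 1 point in the same direction.
print(cosine_distance([0.5], [1.0]))            # 0.0
# Any positive rescaling of a vector behaves the same way.
print(cosine_distance([3.0, 4.0], [6.0, 8.0]))  # 0.0
```

Both pairs consist of distinct vectors at distance zero, which is exactly the failure of the identity of indiscernibles.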
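As for whether cosine distance is a metric on the unit sphere $S_n = \{x \in \mathbb{R}^n: \|x\| = 1\}$, first check that $x = y$ implies $d(x, y) = 0$:
\begin{align} x &= y\\ \Rightarrow d(x,y) &= d(x,x)\\ &= 1 - \sum_{i=1}^n x_i^2\\ &= 1 - \|x\|^2\\ &= 0 \end{align}
So this direction of the identity of indiscernibles holds.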
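For the converse, note that for $\theta \in [0°, 180°]$ we have $\cos(\theta) = 1 \Leftrightarrow \theta = 0°$; at $180°$ the cosine is $-1$, so antipodal points have distance $2$, not $0$. Two unit vectors that enclose an angle of $0°$ are equal, so $d(x, y) = 0 \Rightarrow x = y$, and the identity of indiscernibles holds on $S_n$. Restricting to the non-negative unit sphere $S_n^+ = \{x \in \mathbb{R}^n: \|x\| = 1 \land x_i \geq 0 \text{ for all } i\}$ is therefore not needed for this property; it only confines the distance to $[0, 1]$.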
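As a numeric sanity check on which angles give distance zero, the sketch below evaluates $1 - \cos(\theta)$ for unit vectors in the plane at $0°$, $90°$ and $180°$. The helper names `unit_vector` and `cosine_distance_unit` are introduced here for illustration only.

```python
import math

def unit_vector(theta_deg):
    """Unit vector in the plane at angle theta_deg from the positive x-axis."""
    t = math.radians(theta_deg)
    return (math.cos(t), math.sin(t))

def cosine_distance_unit(x, y):
    """Cosine distance for vectors that are already known to have norm 1."""
    return 1.0 - sum(a * b for a, b in zip(x, y))

x = unit_vector(0)
for theta in (0, 90, 180):
    d = cosine_distance_unit(x, unit_vector(theta))
    # Rounding hides floating-point noise of the order 1e-16.
    print(theta, round(d, 12))
```

Up to rounding, the distance is $0$ only at $0°$, equals $1$ at $90°$, and reaches its maximum of $2$ at $180°$.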