While working on an algorithm-improvement problem, I wondered whether the cosine similarity and the Euclidean distance between vectors could be obtained in a way that reduces the number of squares and square roots that have to be calculated.
The idea is to store each vector in a database as a tuple containing its unit-vector components, its magnitude, and the square of its magnitude.
$$ \mathbf{V}^n = [\hat{v_1}, \hat{v_2} \dots \hat{v_n}, |V|, |V|^2]$$ So the database will contain the following data.
$$ \mathbf{V}^n_m = \begin{array}{c} (\hat{v_{11}}, \hat{v_{12}}, \dots, \hat{v_{1n}}, M_1, S_1)\\ (\hat{v_{21}}, \hat{v_{22}}, \dots, \hat{v_{2n}}, M_2, S_2)\\ \vdots \\ (\hat{v_{m1}}, \hat{v_{m2}}, \dots, \hat{v_{mn}}, M_m, S_m) \end{array}$$
Where
$$ M_m = |V_m| $$ $$ S_m = |V_m|^2$$
In this way, for any two given vectors, I can obtain the cosine similarity as the dot product of the corresponding unit vectors:
$$ \cos \theta_{V_1V_2} = \hat{v}_1 \cdot \hat{v}_2 = \hat{v_{11}}\hat{v_{21}} + \hat{v_{12}}\hat{v_{22}} + \dots + \hat{v_{1n}}\hat{v_{2n}} \\ \cos \theta_{V_1V_3} = \hat{v}_1 \cdot \hat{v}_3 = \hat{v_{11}}\hat{v_{31}} + \hat{v_{12}}\hat{v_{32}} + \dots + \hat{v_{1n}}\hat{v_{3n}} \\ \vdots \\ \cos \theta_{V_1V_m} = \hat{v}_1 \cdot \hat{v}_m = \hat{v_{11}}\hat{v_{m1}} + \hat{v_{12}}\hat{v_{m2}} + \dots + \hat{v_{1n}}\hat{v_{mn}} $$
Now, intuitively, I thought that, having the cosine, I could obtain the Euclidean distance by means of the law of cosines:
$$ \mathbf{d_{V_1V_2}} = \sqrt{S_1+S_2-2M_1M_2\cos \theta_{V_1V_2}} \\ \mathbf{d_{V_1V_3}} = \sqrt{S_1+S_3-2M_1M_3\cos \theta_{V_1V_3}} \\ \vdots \\ \mathbf{d_{V_1V_m}} = \sqrt{S_1+S_m-2M_1M_m\cos \theta_{V_1V_m}} $$
In this way, the cost of calculating both metrics between any two given vectors is reduced to a single square-root calculation. I say intuitively because I can visualize the law of cosines in two- and three-dimensional space, but I am not sure about higher dimensions.
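For concreteness, here is a minimal Python sketch of the scheme described above (the function and variable names are my own, not part of any library): `store` does the one-time preprocessing of a vector, and `cosine_and_distance` then computes both metrics with a single square root.

```python
import math

def store(v):
    """Precompute the stored tuple: unit-vector components, magnitude M, squared magnitude S."""
    s = sum(x * x for x in v)        # S = |V|^2
    m = math.sqrt(s)                 # M = |V|; one square root per stored vector
    unit = [x / m for x in v]        # unit-vector components
    return unit, m, s

def cosine_and_distance(a, b):
    """Cosine similarity and Euclidean distance from two stored tuples."""
    unit_a, m_a, s_a = a
    unit_b, m_b, s_b = b
    cos = sum(x * y for x, y in zip(unit_a, unit_b))     # dot product of unit vectors
    dist = math.sqrt(s_a + s_b - 2 * m_a * m_b * cos)    # law of cosines
    return cos, dist
```

For example, for $V_1 = (3, 4)$ and $V_2 = (6, 8)$ this gives cosine $1$ and distance $5$, matching the direct computation.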
Is there a flaw in the reasoning? Is the law of cosines valid for dimensions higher than 3?
Your reasoning about cosines is fine; in fact, in much of mathematics the angle between two vectors in an inner product space is defined as the arc cosine of their normalized inner product, in any number of dimensions.
I think that answers your question; the rest of what follows is essentially a comment and suggestion relative to your proposed method.
Let's see what happens if you store the data this way: $$ \mathbf{V} = \left( \begin{array}{c} (v_{11}, v_{12}, \ldots v_{1n}, S_1)\\ (v_{21}, v_{22}, \ldots v_{2n}, S_2)\\ \vdots \\ (v_{m1}, v_{m2}, \ldots v_{mn}, S_m) \end{array} \right),$$ where $S_i = \lVert \mathbf V_i\rVert^2 = v_{i1}^2 + v_{i2}^2 + \cdots + v_{in}^2$ for each $1 \leq i \leq m.$ Then for any $1 \leq i < j \leq m,$
$$ \lVert \mathbf V_i\rVert \lVert \mathbf V_j\rVert \cos \theta_{V_iV_j} = \mathbf V_i \cdot \mathbf V_j = v_{i1}v_{j1} + v_{i2}v_{j2} + \cdots + v_{in}v_{jn}. $$
Instead of performing $n + 2$ multiplications and $n - 1$ additions to obtain $M_i M_j \cos \theta_{V_iV_j}$ from $M_i,$ $M_j,$ and the $2n$ components of the unit vectors $\hat {\mathbf V}_i$ and $\hat {\mathbf V}_j,$ you could perform $n$ multiplications and $n - 1$ additions to obtain $\lVert \mathbf V_i\rVert \lVert \mathbf V_j\rVert \cos \theta_{V_iV_j} = \mathbf V_i \cdot \mathbf V_j$ from the $2n$ components of the original vectors ${\mathbf V}_i$ and ${\mathbf V}_j.$ Then every time you compute
$$ \mathbf{d_{V_iV_j}} = \sqrt{S_i + S_j - 2 \mathbf V_i \cdot \mathbf V_j} $$
it would cost you $n + 1$ multiplications (including the multiplication by $2$), $n$ additions, one subtraction, and one square root, rather than $n + 3$ multiplications, $n$ additions, one subtraction, and one square root. Moreover, you save the cost of computing $M_i,$ $M_j,$ and all the $\hat v_{ik}$ and $\hat v_{jk}$ components in the first place.
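Here is a Python sketch of that alternative (again, the names are mine): the stored row holds the raw components together with $S_i$, the dot product gives $\lVert \mathbf V_i\rVert \lVert \mathbf V_j\rVert \cos\theta$ directly, and the cosine itself, if you still need it, costs one extra square root of $S_i S_j$.

```python
import math

def cosine_and_distance_raw(v, s_v, w, s_w):
    """v, w: raw vector components; s_v, s_w: precomputed squared magnitudes S_i, S_j."""
    dot = sum(x * y for x, y in zip(v, w))   # n multiplications, n - 1 additions
    dist = math.sqrt(s_v + s_w - 2 * dot)    # law of cosines; dot already equals M_i M_j cos(theta)
    cos = dot / math.sqrt(s_v * s_w)         # only needed if the cosine itself is required
    return cos, dist
```

Note that the distance computation never normalizes anything: the dot product of the raw vectors already plays the role of $M_i M_j \cos\theta_{V_iV_j}$.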
Granted, if the number of vectors is much larger than the number of dimensions, the number of operations saved is a small fraction of the total cost of computing all the distances, but the point is that storing vectors in the form $$ (\hat v_{11}, \hat v_{12}, \ldots \hat v_{1n}, M_1, S_1)$$ rather than $$(v_{11}, v_{12}, \ldots v_{1n}, S_1)$$ saves you nothing (and actually incurs some added cost) for the purpose of this exercise.
That's not to say there isn't some application where precomputing all the values of $\lVert \mathbf V_i\rVert$ would lower the cost of computation. I've considered that trick myself from time to time. It just doesn't pay off for this particular application.