Is it possible to use the cross product as a measure of similarity?


I'm aware of the concept of cosine similarity to measure the similarity of two non-zero vectors:

$$\text{sim}(\mathbf{v}, \mathbf{w}) = \frac{\mathbf{v} \cdot \mathbf{w}}{\Vert \mathbf{v} \Vert \Vert \mathbf{w} \Vert}$$
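As a quick sanity check of the formula, here is a minimal sketch (the function name `cosine_sim` is mine, not from the paper):

```python
import numpy as np

def cosine_sim(v, w):
    """Cosine of the angle between two non-zero vectors."""
    return np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))

# Vectors at a 45-degree angle have cosine similarity 1/sqrt(2):
print(cosine_sim(np.array([1.0, 0.0]), np.array([1.0, 1.0])))  # ~0.70710678
```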

However, is there a similarity metric using the cross product? That is:

$$\text{sim}(\mathbf{v}, \mathbf{w}) = \frac{\mathbf{v} \times \mathbf{w}}{\Vert \mathbf{v} \Vert \Vert \mathbf{w} \Vert}$$

This may be a bit of TMI, but for anyone who's curious about the context of this question, I was reading a research paper titled Visualizing and Understanding the Effectiveness of BERT (Hao et al., 2019 EMNLP-IJCNLP) and they claim to have used the cross product in the process of computing cosine similarity.


There are 3 best solutions below


First of all, note that the cross product is only defined for vectors in $\mathbb{R}^3$, which makes it quite limiting as a similarity measure.

Second, as Randall pointed out in the comments, $\mathbf{v}\times \mathbf{w}$ is a vector in $\mathbb{R}^3$, so you need to decide how to interpret a vector as a similarity.

Finally, recall that the formula for the cross product is $\mathbf{v}\times\mathbf{w}=\|\mathbf{v}\|\|\mathbf{w}\|\sin(\theta)\mathbf{n}$ where $\theta$ is the angle between $\mathbf{v}$ and $\mathbf{w}$, and $\mathbf{n}$ is the unit normal vector to the plane containing $\mathbf{v}$ and $\mathbf{w}$ in the direction given by the right-hand rule. Thus the magnitude of the cross product is $\|\mathbf{v}\times\mathbf{w}\|=\|\mathbf{v}\|\|\mathbf{w}\||\sin(\theta)|$, which is greatest when $\mathbf{v}$ and $\mathbf{w}$ are perpendicular, and smallest when they are parallel. This property seems rather strange for a similarity measure. For instance, this would imply $(1,0,0)$ is similar to $(0,1,0)$, which is similar to $(0,0,1)$, which is similar to $(1,0,0)$. Yet $(1,0,0)$ is not similar to itself.
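The failure described above is easy to see numerically. Here is a small sketch of the hypothetical "similarity" based on the cross product's magnitude (the name `cross_mag_sim` is mine):

```python
import numpy as np

def cross_mag_sim(v, w):
    # Hypothetical measure: ||v x w|| / (||v|| ||w||) = |sin(theta)|
    return np.linalg.norm(np.cross(v, w)) / (np.linalg.norm(v) * np.linalg.norm(w))

e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])
print(cross_mag_sim(e1, e2))  # 1.0 -- perpendicular vectors score highest
print(cross_mag_sim(e1, e1))  # 0.0 -- a vector scores zero against itself
```

So under this measure a vector is maximally "dissimilar" to itself, which is exactly backwards for a similarity.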


Since $\Vert v\times w\Vert = \Vert v\Vert\Vert w\Vert\cdot |\sin\theta|$, I guess we could define "dissimilarity" or something, as: $$ \mathrm{dissim}(v,w) := \frac{\Vert v\times w\Vert}{\Vert v\Vert\Vert w\Vert} $$ The more perpendicular $v$ and $w$ are, the closer to $1$ the value will be.

It is closely related to cosine similarity (it is just the sine of the angle instead of the cosine), but we lose information by taking the absolute value of the sine. More importantly, this only works in $3$ dimensions, whereas the dot product is defined in any dimension.


At the point in the paper you are describing, equation (3), the two vectors appearing in the equation are two-dimensional, which means this cross product is scalar-valued. (It is the $z$-component of the usual cross product in $\Bbb{R}^3$ if we embed the vectors in the $xy$-plane.)

This scalar is close to $\pm 1$ if the two vectors are nearly perpendicular and close to $0$ if the two vectors are nearly parallel or antiparallel. In the first paragraph of section 3.2, they write "experimental results confirm that the optimization directions ... are orthogonal to each other". So the authors want a similarity measure that prefers orthogonality.
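A minimal sketch of this normalized 2D scalar cross product (the function names are mine; this is an illustration of the operation, not the paper's code):

```python
import math

def cross2d(v, w):
    # z-component of the 3D cross product of (v1, v2, 0) and (w1, w2, 0)
    return v[0] * w[1] - v[1] * w[0]

def normalized_cross2d(v, w):
    # Equals sin(theta) for the signed angle from v to w, so it keeps
    # the turning direction, unlike the magnitude-only 3D version.
    return cross2d(v, w) / (math.hypot(v[0], v[1]) * math.hypot(w[0], w[1]))

print(normalized_cross2d((1.0, 0.0), (0.0, 1.0)))  # 1.0  (perpendicular, counter-clockwise)
print(normalized_cross2d((1.0, 0.0), (2.0, 0.0)))  # 0.0  (parallel)
```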

It is odd that they prefer a particular turning direction (sign of the cross product) for computing the $\alpha^\text{th}$ component of $d_i$, but ignore the direction (by squaring) in the $\beta^\text{th}$ component.