Determine similarity of a sequence of [-1, 1] pairs?


Let's say there are 5 questions, and the answer to each one can be between -1 and 1.

Two people are answering these questions, so they each have a vector of their answers. Example:

[ -.3, 0, .95, -.94, .4 ]
[ .3, .7, .9, 1, -.3 ]

How do I measure how similar their answers are overall? Ideally the result would be a single number, with -1 meaning complete opposites and 1 meaning identical.

Is it just the dot product? Or is there another/better way?


Best answer

The most intuitive metric I can think of for this context is the Euclidean distance (the square root of the sum of squared differences) between the vectors. That is, the distance between $(x_1,\dots,x_5)$ and $(y_1,\dots,y_5)$ is given by

$$ d = \sqrt{\sum_{k=1}^5 (x_k - y_k)^2} $$

Notably, the least $d$ can be is $0$ (identical vectors), and the most $d$ can be is $2 \sqrt{5}$ (every pair of answers at opposite extremes, $\pm 1$ versus $\mp 1$). If you'd like your measure to fall between $-1$ and $1$, you could define

$$ D = 1 - \frac{d}{\sqrt{5}} = 1 - \sqrt{\frac{\sum_{k=1}^5 (x_k - y_k)^2}{5}} $$

Then $D = 1$ when the vectors are identical, and $D = -1$ when the vectors are "opposite" in that extreme sense.
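
As a quick check of these formulas, here is a minimal Python sketch applied to the two example vectors from the question. The function names `euclidean_distance` and `similarity` are made up for illustration, and generalizing from 5 questions to `n` questions (dividing by $\sqrt{n}$ instead of $\sqrt{5}$) is an assumption on my part rather than something stated above.

```python
import math


def euclidean_distance(x, y):
    """Return d = sqrt(sum_k (x_k - y_k)^2) for two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("both answer vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def similarity(x, y):
    """Return D = 1 - d / sqrt(n), which lies in [-1, 1] when all answers are in [-1, 1].

    Assumption: n is the number of questions (5 in the original example).
    """
    n = len(x)
    return 1 - euclidean_distance(x, y) / math.sqrt(n)


# The two answer vectors from the question.
person_a = [-0.3, 0.0, 0.95, -0.94, 0.4]
person_b = [0.3, 0.7, 0.9, 1.0, -0.3]

d = euclidean_distance(person_a, person_b)
D = similarity(person_a, person_b)
print(f"d = {d:.4f}, D = {D:.4f}")
```

For these two vectors $d$ comes out at roughly $2.26$ and $D$ at roughly $-0.01$, so on this scale the two people are close to the midpoint between "identical" and "opposite". Most of the distance comes from the fourth question, where the answers are $-0.94$ and $1$.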