Is there a mathematical operation to ensure that two vectors are perfectly equal?

109 Views Asked by At

Let's said that I have a 3D matrix of integer of size 10x10x4, I want to know if along the 3rd dimension some vectors are perfectly similar (same number, same order).

So I could compare one by one each vectors (of size 4) with each others. But I'm using a computer and the time matters. So is there a mathematical operation to ensure that two vectors are perfectly equal. Or, same question others words, is there a mathematical operation that can link a unique value to a unique vector.

So typicaly for a vector of size 4 (base 10) the associated number should be between 1 and 10^4 (ideally).

2

There are 2 best solutions below

0
On

If by "number" you allow quaternions then of course to the vector $v=(v_1,v_2,v_3,v_4)$ you can associate $v_1+iv_2+jv_3+kv_4$, but that is not helpful at all...

If you know for instance that the entries are integers, then you can send $v$ to $v_1+\sqrt2 v_2+\sqrt3 v_3 + \sqrt5 v_4$ and this would be a bijection.

You have to consider if you are interested in $v=w$ or rather $|v-w|<10^{-8}$, because the two "equalities" are quite different to test. For instance, you cannot hash the vectors in order to test the second one.

Otherwise, if you think about it, the bit representation of the floating points for the components of $v$ are already a number associated to the vector. You just associate to $v$ the binary number obtained from the concatenation of the binary representations of $v_1,\dots,v_4$.

Given that the vectors are very short (only 4 entries), I don't see any real benefit of attempting anything different from pairwise comparison. You can use something like int memcmp(const void *s1, const void *s2, size_t n); (manpage) for that or trivially code your own.

If you know something more about the spacial distribution of your vectors, maybe you can try something like hextrees in order to reduce the number of necessary comparisons.

See also here for ideas on how to partition your vectors. For instance, given $w\in\mathbb R^4$, you can partition your vector in the two families $W_-=\{v:v\cdot w<0\}$ and $W_+=\{v:v\cdot w>0\}$. Now you only need to check for identity among the two families and not across them. So if you are able to find a good $w$ that splits your vectors in two families approximately equinumerous, you pass from $n^2$ tests to $n^2/2$.

Edit Taking further the previous idea of computing $w\cdot v$, you can in fact compute 10x10 matrix of these scalar product and then have to worry only about entries which are the same. This can reduce drastically the number of comparisons.

2
On

What you are looking for is a hash function in the general sense. Let a vector be $(a_0,a_1,a_2,a_3)$. If you know all the components are positive and less than some number $b$, you can just compute $a_0+ba_1+b^2a_2+b^3a_3$ and compare these. That gets you what you want, a unique number for each vector as long as they are what you expect. Whether it is worth computing this for each vector is up to you. I would be surprised if this is the slow point for your code. Of course, if $b$ is the size of the largest number that fits in an int you haven't saved any storage at all.

If you don't know the size of your vectors but all the components are positive whole numbers you can use the Cantor Pairing function. You can find many mentions of it on the site. You can pair $a_0 $ and $a_1$, $a_2$ and $a_3$ and finally the two pairs. This is more computation but handles numbers of any size. If you have negative numbers you need to make them positive some what or adapt the function.