Minimum distance between sequence a and all permutations of another sequence b

301 Views Asked by At

Let $a$ and $b$ be finite sequences of length $n$, i.e. $a=(a_1,a_2,...,a_n), b=(b_1,b_2,...,b_n)$. I want to calculate the minimum of the distances (in an Lp norm) between $a$ and all permutations of $b$. Let $\Pi_n$ be the space of all permutations of $n$ numbers, then I am looking for

$d = \text{min}_{\pi\in\Pi_n} ||a - (b_{\pi(1)},b_{\pi(2)},...,b_{\pi(n)})||_p$

Is there an efficient method or optimization algorithm (something better than trying out all $n!$ permutations of b) to calculate $d$ and find the corresponding permutation $\pi$?

2

There are 2 best solutions below

2
On BEST ANSWER

If $p\ge 1$, then the minimum is attained by rearranging $b$ so that it is "order isomorphic" to $a$, meaning that $b_i-b_j$ has the same sign as $a_i-a_j$ for all $i,j$. To see this, suppose that $a_i<a_j$ and $b_i<b_j$. Using the convexity of the function $f(x)=|x|^p$, you can show that $$ |a_i-b_i|^p+|a_j-b_j|^p\le |a_i-b_j|^p+|a_j-b_i|^p $$ Therefore, if $a_i$ is matched with $b_j$ and $a_j$ with $b_i$, it is more efficient to swap $b_i$ and $b_j$. This means if $b$ is optimal, it must have no such out-of-order matching.

When $p<1$, the function $|x^p|$ is no longer convex, so the same local swapping rule does not always hold. In fact, it is sometimes reversed. Here are some observations:

  • If $a_1<a_2<b_1<b_2$, then it is more efficient to match $a_1$ with $b_2$ and $a_2$ with $b_1$ (this is the opposite of the $p\ge 1$ case).

  • If $a_1<b_1<b_2<a_2$, then it is more efficient to match $a_1$ with $b_1$ and $a_2 $ with $b_2$ (this is the same as the $p\ge 1$ case).

  • If $a_1<b_1<a_2<b_2$, then the optimal matching depends on the values of $a_1,a_2,b_1$ and $b_2$. For example, when $a_2-b_1$ is very small, it is optimal to pair $a_1$ with $b_2$, but as $a_2-b_1$ increases to $\infty$ while $b_1-a_1$ and $b_2-a_2$ remain constant, then eventually the opposite matching is better.

I am not sure if these observations translate into an algorithm.

1
On

An attempt to prove the inequality from Mike's answer. Feel free to correct this proof or fill in any gaps; I suspect that it's flawed.


Claim: if $a_1<a_2$ and $b_1<b_2$ are positive reals and $f$ is a non-negative, increasing, convex function over $[0,\infty)$, then $$ f(|a_1 - b_1|) + f(|a_2 - b_2|) \leq f(|a_1 - b_2|) + f(|a_2 - b_1|) $$

Proof: We note that $g(x) = f(|x|)$ is convex over $\Bbb R$. So, for any $k>0$ and $a,b \in \Bbb R$, we have $$ f(a - k) + f(b+k) \geq f(a) + f(b) $$ (This step seems shaky. I think this holds if $a\leq b$ but not in general, which means that perhaps the rest of the proof needs to be broken into several cases)

It follows that $$ f(|a_1 - b_2|) + f(|a_2 - b_1|) = \\ f(|(a_1 - b_1) - (b_2 - b_1)|) + f(|(a_2 - b_2) + (b_2 - b_1)|) =\\ g((a_1 - b_1) - (b_2 - b_1)) + g((a_2 - b_2) + (b_2 - b_1)) \geq\\ g(a_1 - b_1) + g(a_2 - b_2) = \\ f(|a_1 - b_1|) + f(|a_2 - b_2|) $$ as desired.