Let $x \in \mathbb{R}^p$ denote a $p$ dimensional data point (a vector). I have two sets $A = \{x_1, .., x_n\}$ and $B = \{x_{n+1}, .., x_{n+m}\}$. So $|A| = n$, and $|B| = m$. Given $k \in \mathbb{N^*}$, let $d_x^{(A, k)}$ denote the mean Euclidean distance from $x$ to its $k$ nearest points in $A$; and $d_x^{(B, k)}$ denote the mean Euclidean distance from $x$ to its $k$ nearest points in $B$.
I have the following algorithm:
- $A' = \{ x_i \in A \mid d_{x_i}^{A, k)} > d_{x_i}^{(B, k)} \}$ ... (1)
- $B' = \{ x_i \in B \mid d_{x_i}^{A, k)} < d_{x_i}^{(B, k)}$ ... (2)
- A = $\{ x_i \in A \mid x_i \not\in A' \} \cup B'$ ... (3)
- B = $\{ x_i \in B \mid x_i \not\in B' \} \cup A'$ ... (4)
- Repeat (1), (2), (3) and (4) until: (no element moves from $A$ to $B$ or from $B$ to $A$, that is A' and B' become empty) or (|A| $\leq$ or |B| $\leq$ 1)
Do this algorithm terminates, and if it so, is it possible to easily prove it ?
Note: the $k$ nearest points to $x$ in a set $S$, means: the $k$ points (others than $x$) in $S$, having the smallest Euclidean distance to $x$.
Take $p=1$ and $k=1$. Consider $A=\{0,3\}$ and $B=\{2,5\}$.
$d_5^A=2$ and $d_5^B=3$
So $A'=A$ and $B'=B$, so $A$ becomes $B$ and $B$ becomes $A$, and the algorithm never stops.