Bounding the Sum of Distances between Binary Vectors

78 Views Asked by At

I have a set of $n$ binary vectors $v_1,\ldots,v_n \subseteq \{0,1\}^d$ as well as a distance matrix $D \in \mathbb{Z}_{+}^{n\times n}$ where the $(i,j)^\text{th}$ entry is the distance between vectors $v_i$ and $v_j$: $$d_{i,j} = \sum^d_{i=1}|x_i - y_i|.$$

I am wondering if we can produce a meaningful upper bound on the sum of the entries of the distance matrix $\sum_{ij}{d_{ij}}$ using the pigeonhole principle?

Since $d_{ii}=0$ for all $i$, $d_{ij} = d_{ji}$ and $d_{ij} \in [0,d]$ for all $(i,j)$, it is easy to show that: $$\sum_{i=1}^{i=n}\sum_{j=1}^{j=n} d_{ij} = 2 \sum_{i=2}^{i=n}\sum_{j=1}^{j=i} d_{ij} \leq n(n-1).$$

However, this does not exploit the fact that if $d_{ij} = n$ and $d_{ik} = n$ then $d_{jk} = 0$ (since $v_j = v_k$).

1

There are 1 best solutions below

2
On BEST ANSWER

We can consider each coordinate independently. Let $v_{i, k}$ be the $k$th coordinate of $v_i$. We have $$ \sum_{i=1}^n \sum_{j=1}^n d_{i,j} =\sum_{i=1}^n\sum_{j=1}^n \sum_{k=1}^d |v_{i,k} - v_{j, k}| = \sum_{k=1}^d \left\{\sum_{i=1}^n \sum_{j=1}^n |v_{i, k} - v_{j, k}|\right\} $$ Considering the $k$th coordinate, let $S_{0,k}$ be the set of vectors whose $k$th coordinates are $0$ and $S_{1,k}$ be the remaining ones. It is easy to observe that $$ \sum_{i=1}^n \sum_{j=1}^n |v_{i,k} - v_{j, k}| = 2\cdot|S_{0,k}|\cdot|S_{1,k}| $$ Since $|S_{0,k}| + |S_{1,k}| = n$, $$ |S_{0,k}|\cdot|S_{1,k}| \leq \begin{cases} \frac{n^2}{4} &\text{ if } n \text{ is even}\\ \frac{n^2-1}{4} & \text{ if } n \text{ is odd} \end{cases} $$ Therefore, we conclude that $$ \sum_{i=1}^n\sum_{j=1}^n d_{i,j} \leq \begin{cases} \frac{dn^2}{2} &\text{ if } n \text{ is even}\\ \frac{d(n^2-1)}{2} & \text{ if } n \text{ is odd} \end{cases} $$ The upper bounds are tight since we can construct examples (by following the derivation above) to achieve the bounds.