the centroid-based clustering is NP-hard

33 Views Asked by Bumbble Comm At 11 May 2026 - 5:00

I am trying to understand why the centroid-based clustering is NP-hard.

Assume we have one-dimensional sorted data set $x_{1}, \dots, x_{n}$ and we want to find, say $k$, optimal clusters (intervals) $\mathcal{X}_{1},\dots,\mathcal{X}_{k}$ is a sense of sum of $l_{2}$-distances to the corresponding centroids.

Since we can not solve it in a polinomial time, then, I assume, there must be a case that in the optimal set of clusters $\mathcal{X}_{1},\dots,\mathcal{X}_{k}$ there could be clusters with elements such that:

$x_{1}, x_{2} \in \mathcal{X}_{i}$, $x_{3}\in \mathcal{X}_{j}$ such that $x_{1} > x_{3}$ and $x_{3}>x_{2}$, i.e. intervals do not necessarily contain consecutive elements of the original sorted set $x_{1}, \dots, x_{n}$.

Is this correct?

Original Q&A

the centroid-based clustering is NP-hard

Related Questions in OPTIMIZATION

Related Questions in ALGORITHMS

Related Questions in CONVEX-OPTIMIZATION

Related Questions in COMPUTATIONAL-COMPLEXITY

Related Questions in CLUSTERING

Trending Questions

Popular # Hahtags

Popular Questions