I have come across something like $H^TDH$ where H is an assignment matrix $N\times k$, where $N$ is the number of samples and $k$ is the number of clusters. Let $D$ be the distance matrix which represents pairwise distance between input variables and has only diagonal elements $1$ and and all other entries represent the pairwise distances between samples.
My question is why is it useful in clustering algorithms or in general to transform the pairwise distance matrix $D$ into the basis of $H$, resulting in the matrix $H^T D H$?
Specifically, how does this transformation make computations more efficient and reduce the convergence time of clustering algorithms, and when should one apply this transformation and when is it not necessary?