imbalanced clustering by optimizing normalized cut

90 Views Asked by At

I have been studying spectral graph partitioning and I found that this algorithm optimize normalized cut by taking equal size cluster, which might not be a real world scenario. Is there any algorithm that optimize normalized cut but not necessarily taking equal size cluster?

1

There are 1 best solutions below

0
On

A better approach is to find an affinity matrix $\bf A$, for example: $${\bf A}_{ij} = \exp(-|{\bf D}_{ij}|)$$ where $\bf D$ is a distance matrix, for example: $${\bf D}_{ij} = |{\bf d}_i-{\bf d}_j|$$ Where $\bf d$ is your data vector ( assuming it 1 dimensional for simplicity ).

If you do this, then you can search for the second largest eigenvalue of your affinity matrix, and corresponding eigenvector, and your two directions for cut are positive real and negative real. So 0 as element in your eigenvector is the cut decision boundary. This will find cuts where the relation of the sizes of your clusters can be arbitrary.

To find one of the largest eigenvalues and corresponding eigenvectors, popular methods include power iteration and Krylov subspace methods such as Lanczos iterations.


EDIT To solve the issue you face you may want to remove the mean value of your eigenvector. Otherwise 0 won't be the cut decision boundary as i claimed above.