Showing K-means as minimizing a weighted average of intra-cluster deviation and maximizing the inter-cluster deviation

88 Views Asked by At

I'm currently working on an assignment that asks to establish a relationship between intra cluster distance minimization, inter cluster distance maximization and the variance of the dataset. Here are some of the definitions I am working with.

base formulas

In the above, T(X) is the variance of the dataset X, W(X) is intra cluster deviation and B(X) is inter cluster deviation.

The problem I am having is getting the right hand side of this equality in the form of (n^2)*T(X) + some set of terms. I understand that I can factor out gamma ij to then expand the squared L2 norms to get something along the lines of (excluding the two summations before gamma)

gamma ij(||xi - x̂||^2 - 2 ||xi||||u|| - 2 ||u||||x̂|| + 2 ||u||^2)

The first term inside the parentheses is very close to T(X), however I'm unsure how to manipulate the summations with gamma being present. Without gamma,I could easily break it up such that values with an i subscript are multiplied by k and values with a j subscript are multiplied by n. However, gamma is a matrix with 0/1 values that uses both i and j. As such I'm more or less stuck and would like some advice on possible avenues from here.