What is the correct formula for the Within-Cluster Sum of Squares?

I am studying clustering with the K-Means algorithm and I got stuck on the "inertia", or "within-cluster sum of squares", part. First, I would appreciate it if anyone could explain the difference between these two terms, or tell me whether they are the same. The problem I ran into was that, when searching for the within-cluster sum of squares on Google Images, I found these formulas:

$WSS = \sum_{i=1}^{N_c} \sum_{x \in C_i} d\left(x, \overline{x}_{C_i}\right)^2$

From this image: img 1

$WCSS = \sum_{C_k}^{C_n} \left( \sum_{d_i \in C_i}^{d_m} \operatorname{distance}\left(d_i, C_k\right)^2 \right)$

From this image: img 2

And lastly:

$WCSS = \frac{1}{N} \sum_{i=1}^{K} \sum_{j=1}^{n_i} \left(x_{ij} - \overline{x_i}\right)^2$

From this image: img 3

The last image even calls it "Total Within Cluster Sum Of Squares".

I know that the inertia formula for a single cluster is:

$\sum_{i=1}^{N} \left(x_i - C_k\right)^2$

That formula makes sense to me, so, based on it, the formula from the first image seems the most sensible of the three.
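To make the comparison concrete, here is a minimal sketch in plain Python of how I understand the computation. It assumes the distance is Euclidean, that each cluster's center is the mean of its points, and that $N$ in the third formula is the total number of points. The toy data and the helper names (`centroid`, `cluster_ss`, `total_wcss`) are made up for illustration and do not come from any of the images.

```python
# Minimal sketch: within-cluster sum of squares with squared Euclidean distance.
# Assumptions: the cluster center is the mean of its points; data is hypothetical.

def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def cluster_ss(points):
    """Sum of squared Euclidean distances from each point to the cluster mean
    (the single-cluster inertia formula)."""
    c = centroid(points)
    return sum(sum((x - m) ** 2 for x, m in zip(p, c)) for p in points)

def total_wcss(clusters):
    """Sum of the per-cluster sums of squares over all clusters
    (the double sum in the first formula)."""
    return sum(cluster_ss(points) for points in clusters)

clusters = [
    [(0.0, 0.0), (2.0, 0.0)],      # centroid (1, 0), contributes 1 + 1 = 2
    [(10.0, 10.0), (10.0, 14.0)],  # centroid (10, 12), contributes 4 + 4 = 8
]

wcss = total_wcss(clusters)
n_points = sum(len(points) for points in clusters)

print(wcss)             # 10.0
print(wcss / n_points)  # 2.5 (same quantity scaled by 1/N, as in the third formula)
```

If these assumptions are right, the third formula would differ from the first only by the constant factor $1/N$, so both would rank the same clusterings in the same order for a fixed data set.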