Why doesn't k-means give the global minima?

1.7k Views Asked by At

I read that the k-means algorithm only converges to a local minima and not to a global minima. Why is this? I can logically think of how initialization could affect the final clustering and there is a possibility of sub-optimum clustering, but I did not find anything that will mathematically prove that.

Also, why is k-means an iterative process? Can't we just partially differentiate the objective function w.r.t. to the centroids, equate it to zero to find the centroids that minimizes this function? Why do we have to use gradient descent to reach the minimum step by step?