Compute gradient in $k$-means


For the $k$-means problem, we fix a partition $C_1,\dots,C_k$ of $[n] := \{1,2,\dots,n\}$. Then, under the $k$-means objective, the cost is

$$ G(C_1, \dots, C_k) = \min_{\mu_1, \dots, \mu_k \in \mathbb{R}^d} G(C_1, \dots, C_k; \mu_1, \dots,\mu_k) $$

where

$$G(C_1,\dots, C_k; \mu_1,\dots,\mu_k) = \sum_{i=1}^k \sum_{j \in C_i} \| \mathbf{x}_j-\mu_i \|^2 = \sum_{j=1}^n \| \mathbf{x}_j - \mu_{C(j)} \|^2$$

where each $\mu_i$ is the center of cluster $C_i$.
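To make the objective concrete, here is a small numpy sketch of the cost $G(C_1,\dots,C_k;\mu_1,\dots,\mu_k)$; the function and variable names are my own, not from any particular library:

```python
import numpy as np

def kmeans_cost(X, labels, mus):
    """G(C_1,...,C_k; mu_1,...,mu_k): sum of squared distances
    from each point x_j to the center mu_{C(j)} of its cluster.

    X      : (n, d) array of data points x_1, ..., x_n
    labels : (n,) array of cluster indices, labels[j] = C(j)
    mus    : (k, d) array of centers mu_1, ..., mu_k
    """
    # mus[labels] picks the center of each point's cluster, row by row
    return np.sum((X - mus[labels]) ** 2)

# Tiny example: two points in one cluster centered at (1, 0)
X = np.array([[0.0, 0.0], [2.0, 0.0]])
labels = np.array([0, 0])
mus = np.array([[1.0, 0.0]])
print(kmeans_cost(X, labels, mus))  # 1 + 1 = 2.0
```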


I want to calculate the gradient of this function ($\mathbf{x}_j$ and $\mu_i$ have the same dimension $d$): $$F_i(\mu_i)=\sum_{j \in C_i}\|\mathbf{x}_j-\mu_i\|^2$$


First, I am confused by the definition. Since $\mu_i \in \mathbb{R}^d$, does the gradient here mean the vector of partial derivatives with respect to the $d$ components of $\mu_i$: $$\nabla F_i(\mu_i) = \left[ \frac{\partial F_i(\mu_i)}{\partial (\mu_i)_1},\cdots, \frac{\partial F_i(\mu_i)}{\partial (\mu_i)_d}\right]^T?$$

Then I first expand the term:

$$F_i(\mu_i)=\sum_{j \in C_i}\|\mathbf{x}_j-\mu_i\|^2=\sum_{j \in C_i}\left(\|\mathbf{x}_j\|^2-2\mathbf{x}_j^T\mu_i+\|\mu_i\|^2\right)$$

I also tried another method using the definition of the norm:

$$F_i(\mu_i)=\sum_{j \in C_i}\|\mathbf{x}_j-\mu_i\|^2=\sum_{j \in C_i} \sum_{k=1}^d \left((\mathbf{x}_j)_k-(\mu_i)_k\right)^2$$

But either way, I got stuck because of the indices. Can anyone help me or give me some hints?