Please help me understand how to obtain
$$J' = 2\sum_{n=1}^{N}r_{nk}(x_{n} - \mu_{k})$$
from
$$J = \sum_{n=1}^{N}\sum_{k=1}^{K}r_{nk}\left \| x_{n} - \mu _{k} \right \|^{2}$$
Given
- N data points $x_{n} \ (n = 1, \dots, N)$
- K clusters
- $r_{nk}$: binary indicators in a 1-of-K coding scheme ($r_{nk} = 1$ if $x_{n}$ is assigned to cluster $k$, and $0$ otherwise)
- $\mu_{k}$: Cluster centres
Here $J$ is the distortion function, which should be minimized by finding the $\mu_{k}$ and $r_{nk}$ for which $J$ is a minimum, and $J'$ is its derivative with respect to $\mu_{k}$. Also, how do you solve $J' = 0$ for $\mu_{k}$? The result is
$$\mu_{k} = \frac{\sum_{n} r_{nk}x_{n}}{\sum_{n} r_{nk}}$$
but I want to understand the procedure used to obtain it.
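
For reference, here is a sketch of how the steps might go. It assumes the assignments $r_{nk}$ are held fixed and that $J$ is differentiated with respect to a single centre $\mu_{k}$, so only the terms of the double sum with that $k$ contribute. Using $\frac{\partial}{\partial \mu}\left\| x - \mu \right\|^{2} = -2(x - \mu)$:

$$\frac{\partial J}{\partial \mu_{k}} = \sum_{n=1}^{N} r_{nk}\,\frac{\partial}{\partial \mu_{k}}\left\| x_{n} - \mu_{k} \right\|^{2} = -2\sum_{n=1}^{N} r_{nk}(x_{n} - \mu_{k})$$

Under this reading the derivative carries a factor of $-2$ rather than $2$. That constant drops out once the expression is set to zero, which may be why the condition is often quoted with a plain $2$. Since $\mu_{k}$ does not depend on $n$, it can be pulled out of the sum:

$$\sum_{n=1}^{N} r_{nk}x_{n} - \mu_{k}\sum_{n=1}^{N} r_{nk} = 0 \quad\Longrightarrow\quad \mu_{k} = \frac{\sum_{n} r_{nk}x_{n}}{\sum_{n} r_{nk}}$$

This requires $\sum_{n} r_{nk} > 0$, that is, at least one point assigned to cluster $k$. Because $J$ is a convex quadratic in $\mu_{k}$, this stationary point is a minimum, and $\mu_{k}$ is just the mean of the points assigned to cluster $k$.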