- randomly initialize $K$ cluster centroids $\mu_1, \mu_2, \cdots, \mu_K$
- repeat:
  - assign each point to its closest centroid $\mu_k$
  - recompute each centroid as the average of the points assigned to it
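
The loop above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names (`assign`, `recompute`, `kmeans`), the `max_iters` cap, the choice to initialize centroids at `K` randomly chosen training points, and the handling of empty clusters are all assumptions made for the example.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def assign(points, centroids):
    """For each point, return the index of its closest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
        for p in points
    ]

def recompute(points, assignments, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, c in zip(points, assignments) if c == k]
        if not members:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(old)
            continue
        new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
    return new_centroids

def kmeans(points, k, max_iters=100, seed=None):
    rng = random.Random(seed)
    # Assumption: initialize the centroids at k randomly chosen training points.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iters):
        new_assignments = assign(points, centroids)
        if new_assignments == assignments:
            break  # no point changed cluster, so the loop has converged
        assignments = new_assignments
        centroids = recompute(points, assignments, centroids)
    return centroids, assignments
```

Calling `kmeans(points, k=2)` on a list of equal-length tuples returns the final centroids and, for each point, the index of its cluster.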
- $c^{(i)}$ = index of the cluster to which example $x^{(i)}$ is currently assigned
- $\mu_k$ = cluster centroid $k$
- $\mu_{c^{(i)}}$ = centroid of the cluster to which example $x^{(i)}$ has been assigned

$$
J = \frac{1}{m} \sum_{i=1}^m \left\| x^{(i)} - \mu_{c^{(i)}} \right\|^2
$$
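
As a small worked example of the cost, the sketch below averages the distance from each point to its assigned centroid, assuming the usual squared Euclidean norm for the K-means distortion. The function name `cost` and the toy data are made up for illustration.

```python
def cost(points, centroids, assignments):
    """Average squared Euclidean distance from each point to its assigned centroid."""
    m = len(points)
    total = 0.0
    for x, c in zip(points, assignments):
        mu = centroids[c]  # centroid of the cluster that x is assigned to
        total += sum((xj - muj) ** 2 for xj, muj in zip(x, mu))
    return total / m

# Toy data: two points share cluster 0, one point sits exactly on centroid 1.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0)]
centroids = [(0.0, 1.0), (10.0, 10.0)]
assignments = [0, 0, 1]

J = cost(points, centroids, assignments)  # (1 + 1 + 0) / 3
```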

- for i = 1 to n (n is usually 50 to 1000):
  - randomly initialize the $K$ centroids
  - run K-means
  - compute the cost function $J$
- pick the set of clusters that gives the lowest cost $J$
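
The restart procedure can be sketched as follows. It is self-contained, so it carries its own compact K-means. The names `run_kmeans` and `best_of_n`, the default `n=100`, and the use of the squared-distance cost are assumptions for illustration.

```python
import random

def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def run_kmeans(points, k, rng, max_iters=100):
    """One K-means run from a random initialization drawn with rng."""
    centroids = rng.sample(points, k)  # assumption: start at k random training points
    assignments = None
    for _ in range(max_iters):
        new = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        if new == assignments:
            break
        assignments = new
        for j in range(k):
            members = [p for p, c in zip(points, assignments) if c == j]
            if members:  # an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignments

def cost(points, centroids, assignments):
    """Average squared distance from each point to its assigned centroid."""
    return sum(sq_dist(p, centroids[c]) for p, c in zip(points, assignments)) / len(points)

def best_of_n(points, k, n=100, seed=None):
    """Run K-means n times and keep the run with the lowest cost."""
    rng = random.Random(seed)
    best = None
    for _ in range(n):
        centroids, assignments = run_kmeans(points, k, rng)
        j = cost(points, centroids, assignments)
        if best is None or j < best[0]:
            best = (j, centroids, assignments)
    return best  # (lowest cost, its centroids, its assignments)
```

Sharing one `rng` across runs means each restart gets a different initialization while the whole procedure stays reproducible for a fixed `seed`.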