Flat clustering (Chapter 16) Algorithm 1 K-means({⃗𝑥1, . . . , ⃗𝑥 𝑁 }, 𝐾, stopping criterion) 1: (⃗𝑠1, . . . ,⃗𝑠 𝐾) ← SelectRandomSeeds({⃗𝑥1, . . . , ⃗𝑥 𝑁 }, 𝐾) 2: for 𝑘 ← 1 to 𝐾 do 3: ⃗𝜇 𝑘 ← ⃗𝑠 𝑘 4: end for 5: repeat 6: for 𝑘 ← 1 to 𝐾 do 7: 𝜔 𝑘 ← {} 8: end for 9: for 𝑛 ← 1 to 𝑁 do 10: 𝑗 ← argmin 𝑗′ |⃗𝜇 𝑗′ − ⃗𝑥 𝑛| 11: 𝜔 𝑗 ← 𝜔 𝑗 ∪ {⃗𝑥 𝑛} ◁ reassigning vectors 12: end for 13: for 𝑘 ← 1 to 𝐾 do 14: ⃗𝜇 𝑘 ← 1 |𝜔 𝑘| ∑︀ ⃗𝑥∈𝜔 𝑘 ⃗𝑥 ◁ recomputing centroids 15: end for 16: until a stopping criterion has been met 17: return {⃗𝜇1, . . . , ⃗𝜇 𝐾} Exercise 16/1 Use the 𝐾-means algorithm with Euclidean distance to cluster the following 𝑁 = 8 examples into 𝐾 = 3 clusters: 𝐴1 = (2, 10), 𝐴2 = (2, 5), 𝐴3 = (8, 4), 𝐴4 = (5, 8), 𝐴5 = (7, 5), 𝐴6 = (6, 4), 𝐴7 = (1, 2), 𝐴8 = (4, 9). Suppose that the initial seeds (centers of each cluster) are 𝐴1, 𝐴4 and 𝐴7. Run the 𝐾-means algorithm for 3 epochs. After each epoch, draw a 10 × 10 space with all the 8 points and show the clusters with the new centroids. Exercise 16/2 What makes a good clustering? Give some clustering evaluation metrics. 1