
As far as I know, this is how the latter two algorithms work...

Lloyd's algorithm

  1. Choose the number of clusters.
  2. Choose a distance metric (typically squared Euclidean).
  3. Randomly assign each observation to a cluster and compute the cluster centroids.
  4. Iterate the steps below until convergence (i.e. until the cluster centroids stop changing):
  • Assign each observation point to the cluster whose centroid is closest.
  • Update cluster centroids only after a complete pass through all observations.
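To make sure I've understood it, here is a minimal NumPy sketch of the Lloyd's steps above. The function name, the initialisation detail (a balanced random assignment), and the empty-cluster guard are my own choices, not from any particular reference implementation:

```python
import numpy as np

def lloyd(X, k, max_iter=100, seed=0):
    """Rough sketch of Lloyd's (batch) k-means, as described above."""
    rng = np.random.default_rng(seed)
    n = len(X)
    # Step 3: randomly assign each observation to a cluster
    # (permutation % k keeps every cluster non-empty at the start)
    labels = rng.permutation(n) % k
    centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    for _ in range(max_iter):
        # Assign each observation to the nearest centroid (squared Euclidean)
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = d.argmin(axis=1)
        # Centroids are updated only after a complete pass through the data
        for j in range(k):
            pts = X[new_labels == j]
            if len(pts):  # keep the old centroid if a cluster empties out
                centroids[j] = pts.mean(axis=0)
        if np.array_equal(new_labels, labels):
            break  # converged: assignments (hence centroids) stopped changing
        labels = new_labels
    return labels, centroids
```

The key point, as I understand it, is that every observation is reassigned against a *frozen* set of centroids, and the centroids move only once per pass.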

MacQueen's algorithm

  1. Choose the number of clusters.
  2. Choose a distance metric (typically squared Euclidean).
  3. Randomly assign each observation to a cluster and compute the cluster centroids.
  4. Perform one complete pass through the steps below (i.e. visit every observation once):
  • Assign the current observation to the cluster whose centroid is closest.
  • Immediately update the centroids for the two affected clusters (i.e. for the cluster that lost an observation and for the cluster that gained it).
  5. Update the centroids again after the complete pass.
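And here is a corresponding sketch of MacQueen's online variant, again in NumPy; the incremental mean-update formulas and the `n_passes` loop are my own way of expressing steps 4-5, not taken from any particular implementation:

```python
import numpy as np

def macqueen(X, k, n_passes=10, seed=0):
    """Rough sketch of MacQueen's (online) k-means, as described above."""
    rng = np.random.default_rng(seed)
    n = len(X)
    # Step 3: balanced random assignment, then initial centroids
    labels = rng.permutation(n) % k
    centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    counts = np.bincount(labels, minlength=k)
    for _ in range(n_passes):
        moved = False
        for i in range(n):  # one complete pass through the observations
            d = ((centroids - X[i]) ** 2).sum(axis=1)
            j = int(d.argmin())
            old = labels[i]
            if j != old:
                # Immediately update the two affected centroids:
                # the cluster that lost X[i] and the one that gained it
                counts[old] -= 1
                counts[j] += 1
                if counts[old] > 0:
                    centroids[old] = (centroids[old] * (counts[old] + 1) - X[i]) / counts[old]
                centroids[j] = (centroids[j] * (counts[j] - 1) + X[i]) / counts[j]
                labels[i] = j
                moved = True
        # Step 5: recompute all centroids after the complete pass
        for c in range(k):
            if counts[c] > 0:
                centroids[c] = X[labels == c].mean(axis=0)
        if not moved:
            break  # no reassignments in a full pass: converged
    return labels, centroids
```

The contrast with Lloyd's sketch is that the centroids a later observation is compared against already reflect the reassignments made earlier in the same pass.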

How does the Hartigan & Wong algorithm compare to the two above? I read this paper in an effort to understand, but it's still not clear to me. The first three steps are the same as in Lloyd's and MacQueen's algorithms (as described above), but then what does the algorithm do? Does it update the centroids as often as MacQueen's algorithm does, or as often as Lloyd's? At what point does it take the within-cluster sum of squares into consideration, and how does that fit into the algorithm?

I'm generally confused when it comes to this algorithm and would very much appreciate a step-by-step explanation of what's going on.
