Hierarchical Clustering

Hierarchical clustering groups data using two user-defined measures: a distance between records and a dissimilarity between clusters. The resulting sequence of merges can be visualized as a dendrogram, which shows how the clusters combine at each level.

Steps:

  1. Start with every record as its own cluster
  2. Calculate the distance between all pairs of records
  3. Using one of four dissimilarity metrics, calculate the dissimilarity between all pairs of clusters
  4. Merge the two clusters that are least dissimilar
  5. Repeat steps 3 and 4 until only one cluster remains
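
The steps above can be sketched in a few lines of Python. This is a minimal, unoptimized illustration, not a reference implementation: the function names (euclidean, single_linkage, agglomerate) are hypothetical, Euclidean distance is an assumed record distance, and single linkage (the smallest distance between any two records in the two clusters) is assumed as the cluster dissimilarity. Any other distance or dissimilarity could be passed in instead.

```python
from itertools import combinations
import math


def euclidean(a, b):
    """Distance between two records (an assumed choice; any metric works)."""
    return math.dist(a, b)


def single_linkage(c1, c2, records, dist):
    """Cluster dissimilarity (assumed): smallest distance between their records."""
    return min(dist(records[i], records[j]) for i in c1 for j in c2)


def agglomerate(records, dist=euclidean, linkage=single_linkage):
    """Merge clusters bottom-up and return the merges in the order they happen."""
    # Step 1: every record starts as its own cluster (a tuple of record indices).
    clusters = [(i,) for i in range(len(records))]
    merges = []
    while len(clusters) > 1:
        # Steps 2-3: find the pair of current clusters with the lowest dissimilarity.
        # Record distances are recomputed on demand here rather than cached.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: linkage(pair[0], pair[1], records, dist),
        )
        height = linkage(a, b, records, dist)
        # Step 4: replace the two least dissimilar clusters with their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, height))
    # Step 5: the loop ends once a single cluster remains.
    return merges


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 7.0)]
for left, right, height in agglomerate(points):
    print(left, right, round(height, 3))
```

Each entry in the returned list records which two clusters were merged and at what dissimilarity. That is the information a dendrogram draws: the merge order gives the tree structure and the dissimilarity gives the height of each join.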