Hierarchical Clustering
Clusters data using a user-defined distance metric (between records) and a dissimilarity metric (between clusters). Because clusters are merged one pair at a time, the full sequence of merges can be visualized as a dendrogram.
Steps:
- Start with every record as its own cluster
- Calculate the distance between all pairs of records
- Using one of four dissimilarity metrics, calculate the dissimilarity between all pairs of clusters
- Merge the two clusters that are least dissimilar
- Repeat the previous two steps until only one cluster remains
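
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation. The notes do not name the four dissimilarity metrics, so the sketch assumes three common linkage rules (single, complete, average) and Euclidean distance between records. The names `agglomerate`, `cluster_dissimilarity`, and `points` are hypothetical and chosen only for this example.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Distance between two records (assumed metric; any function works)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_dissimilarity(c1, c2, dist, method):
    """Dissimilarity between two clusters, given as tuples of record indices."""
    pairwise = [dist[i][j] for i in c1 for j in c2]
    if method == "single":      # closest pair of records
        return min(pairwise)
    if method == "complete":    # farthest pair of records
        return max(pairwise)
    if method == "average":     # mean over all cross-cluster pairs
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown method: {method}")


def agglomerate(records, method="single", distance=euclidean):
    """Return the list of merges as (cluster_a, cluster_b, dissimilarity)."""
    n = len(records)
    # Distance between all pairs of records, computed once.
    dist = [[distance(records[i], records[j]) for j in range(n)]
            for i in range(n)]
    # Every record starts as its own cluster.
    clusters = [(i,) for i in range(n)]
    merges = []
    while len(clusters) > 1:
        # Find the least dissimilar pair of clusters.
        a, b = min(
            combinations(clusters, 2),
            key=lambda p: cluster_dissimilarity(p[0], p[1], dist, method),
        )
        height = cluster_dissimilarity(a, b, dist, method)
        # Merge them and continue until one cluster is left.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, height))
    return merges


# Made-up example data: two tight pairs and one far-away record.
points = [(0, 0), (0, 1), (5, 5), (5, 7), (20, 20)]
merges = agglomerate(points, method="single")
for a, b, height in merges:
    print(a, "+", b, "at dissimilarity", round(height, 3))
```

Each entry in `merges` corresponds to one join in the dendrogram, with the dissimilarity as the height of that join. The sketch recomputes every cluster pair on each pass, which is fine for small data but slow for large data; a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` is the usual choice in practice.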