CluSim: a package for calculating clustering similarity.

Clustering Analysis

A deep dive into clustering similarity.

CluSim: a Python package for calculating clustering similarity

Clustering similarity measures can be classified by the type of clustering they compare: i) partitions that group elements into non-overlapping clusters, ii) hierarchical clusterings that group elements into a nested series of partitions (a.k.a. a dendrogram), or …

Element-centric clustering comparison unifies overlaps and hierarchy

Clustering is one of the most universal approaches for understanding complex data. A pivotal aspect of clustering analysis is quantitatively comparing clusterings; clustering comparison is the basis for many tasks such as clustering evaluation, …

The impact of random models on clustering similarity

We derive corrected variants of two clustering similarity measures (the Rand index and mutual information) in the context of two random clustering ensembles in which the number and sizes of clusters vary. In addition, we study the impact of one-sided comparisons in the scenario with a reference clustering. We demonstrate that the choice of random model can have a drastic impact on the ranking of similar clustering pairs and on the evaluation of a clustering method with respect to a random baseline; thus, the choice of random clustering model should be carefully justified.
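
To make the role of the random model concrete, here is a minimal sketch in plain Python. It does not use the CluSim API, and it is not the derivation from the paper. It adjusts the Rand index against two baselines chosen for illustration: the familiar permutation model, which keeps the cluster sizes of both clusterings fixed, and a model that draws both clusterings uniformly from all partitions of n elements, so that the number and sizes of clusters vary. The second baseline relies on the fact that two elements share a cluster in a uniformly random partition with probability B(n-1)/B(n), where B is the Bell number. All function names are hypothetical.

```python
from collections import Counter
from math import comb


def pair_counts(labels_a, labels_b):
    """Return (all pairs, pairs together in A, together in B, together in both)."""
    n = len(labels_a)
    same_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    same_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    same_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    return comb(n, 2), same_a, same_b, same_both


def rand_index(labels_a, labels_b):
    """Fraction of element pairs on which the two partitions agree."""
    total, same_a, same_b, same_both = pair_counts(labels_a, labels_b)
    apart_both = total - same_a - same_b + same_both
    return (same_both + apart_both) / total


def expected_rand_perm(labels_a, labels_b):
    """Expected Rand index when cluster sizes are fixed (permutation model)."""
    total, same_a, same_b, _ = pair_counts(labels_a, labels_b)
    p_a, p_b = same_a / total, same_b / total
    return p_a * p_b + (1 - p_a) * (1 - p_b)


def bell_numbers(n):
    """Bell numbers B_0..B_n computed with the Bell triangle."""
    row, bells = [1], [1]
    for _ in range(n):
        new_row = [row[-1]]
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
        bells.append(row[0])
    return bells


def expected_rand_all(n):
    """Expected Rand index when both partitions are uniform over all partitions."""
    bells = bell_numbers(n)
    p = bells[n - 1] / bells[n]  # probability that a given pair is co-clustered
    return p * p + (1 - p) * (1 - p)


def adjust(value, expected):
    """Rescale so the random baseline maps to 0 and perfect agreement to 1."""
    return (value - expected) / (1 - expected)


a = [0, 0, 0, 1, 1, 1]
b = [0, 0, 1, 1, 2, 2]
ri = rand_index(a, b)
ari_perm = adjust(ri, expected_rand_perm(a, b))
ari_all = adjust(ri, expected_rand_all(len(a)))
print(ri, ari_perm, ari_all)
```

The raw Rand index is the same in both cases; only the baseline changes, and the two adjusted scores differ for this pair of clusterings.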