CluSim: a package for calculating clustering similarity.
A deep dive into clustering similarity.
Clustering similarity measures can be classified based on the cluster types: i) partitions that group elements into non-overlapping clusters, ii) hierarchical clusterings that group elements into a nested series of partitions (a.k.a. dendrogram), or …
Clustering is one of the most universal approaches for understanding complex data. A pivotal aspect of clustering analysis is quantitatively comparing clusterings; clustering comparison is the basis for many tasks such as clustering evaluation, …
We derive corrected variants of two clustering similarity measures (the Rand index and Mutual Information) in the context of two random clustering ensembles in which the number and sizes of clusters vary. In addition, we study the impact of one-sided comparisons in the scenario with a reference clustering. We demonstrate that the choice of random model can have a drastic impact on the ranking of similar clustering pairs, and the evaluation of a clustering method with respect to a random baseline; thus, the choice of random clustering model should be carefully justified.