The impact of random models on standardized clustering similarity.
IEEE Access 12, 179879-179890 (2024)
Clustering similarity measures are essential for evaluating clustering results and ensuring diversity in multiple clusterings of the same dataset. Common indices like the Mutual Information (MI) and the Rand Index (RI) are biased towards smaller clusters and are often adjusted using a random permutation model. Recent advancements have standardized these measures to further correct biases, but the impact of different random models on these standardized measures has not yet been studied. In this work, we introduce equations for standardizing the MI/RI under non-permutation models, specifically focusing on a uniform model over all clusterings and a model that fixes the number of clusters. Our results show that while standardization improves performance for the fixed-number-of-clusters model, its benefits are limited in the more general uniform model. We validate our findings with gene expression data, highlighting the importance of choosing the right similarity metric for clustering comparison.
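To make the idea of standardization concrete, the following is a minimal sketch, not the authors' method: the paper introduces closed-form equations, whereas this example swaps in a Monte Carlo estimate of the standardized Rand Index, (RI - E[RI]) / std(RI), under the permutation model only. The function names `rand_index` and `standardized_rand_index`, the sample count, and the seed are assumptions made for illustration.

```python
import random
from itertools import combinations
from statistics import mean, pstdev


def rand_index(a, b):
    """Fraction of element pairs on which two labelings agree.

    A pair agrees if it is grouped together in both labelings
    or separated in both labelings.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("labelings must have equal length of at least 2")
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def standardized_rand_index(a, b, n_samples=2000, seed=0):
    """Monte Carlo estimate of (RI - E[RI]) / std(RI).

    Assumption: the random model is the permutation model, simulated by
    shuffling the labels of b, which keeps its cluster sizes fixed.
    """
    rng = random.Random(seed)
    shuffled = list(b)
    samples = []
    for _ in range(n_samples):
        rng.shuffle(shuffled)
        samples.append(rand_index(a, shuffled))
    mu, sigma = mean(samples), pstdev(samples)
    if sigma == 0:
        raise ValueError("RI is constant under the random model")
    return (rand_index(a, b) - mu) / sigma
```

Calling `standardized_rand_index(a, b)` on two label lists of equal length returns a z-score-like value: how many standard deviations the observed RI lies above its mean under random relabeling. Simulating a different random model, such as the uniform model over all clusterings, would require replacing the shuffling step with a sampler for that model.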
Publication type
Article: Journal article
Document type
Scientific article
Keywords
Clustering Comparison ; External Evaluation Metrics ; Machine Learning ; Mutual Information ; Rand Index ; Random Model
Language
English
Year of publication
2024
HGF reporting year
2024
ISSN (print) / ISBN
2169-3536
e-ISSN
2169-3536
Source details
Volume: 12
Pages: 179879-179890
Publisher
IEEE
Place of publication
445 Hoes Lane, Piscataway, NJ 08855-4141, USA
Review status
Peer reviewed
POF Topic(s)
30205 - Bioengineering and Digital Health
Research field(s)
Enabling and Novel Technologies
PSP element(s)
G-540008-001
Funding
Digital Europe Grant Testing and Experimentation Facility for Health Artificial Intelligence (AI) and Robotics (TEF-Health)
Date of record entry
2024-12-09