PuSH - Publication Server of Helmholtz Zentrum München

A data-driven solution for the cold start problem in biomedical image classification.

In: (Proceedings - International Symposium on Biomedical Imaging, 27-30 May 2024, Athens). 2024. DOI: 10.1109/ISBI56570.2024.10635886
The demand for large quantities of high-quality annotated images poses a significant bottleneck for developing effective deep learning-based classifiers in the biomedical domain. We present a simple yet powerful solution to the cold start problem, i.e., selecting the most informative data for annotation within an unlabeled dataset. Our framework consists of three key components: (i) a self-supervised encoder to construct meaningful representations of unlabeled data, (ii) a sampling method that selects the most representative data points for annotation, and (iii) a classifier head that uses model ensembling to overcome the lack of validation data. We test our approach on four challenging public biomedical datasets. Our strategy outperforms the state-of-the-art approach in detecting the representative data points in all datasets and achieves a 7% improvement on a leukemia blood cell classification task. Our work offers a practical and efficient solution to the challenges associated with tedious and costly high-quality data annotation in the biomedical field. We make our framework's code publicly available at https://github.com/marrlab/initial-data-point-selection.
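The selection and ensembling steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the sampling method, so the sketch assumes a common stand-in (k-means clustering of the embeddings, then picking the point nearest each centroid), and it assumes the self-supervised encoder has already produced the embeddings. The names `select_representatives` and `ensemble_predict` are hypothetical, and each "head" is assumed to be any callable that returns a class-probability vector.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def select_representatives(embeddings, budget, iters=20, seed=0):
    """Assumed stand-in sampler: k-means with k = budget, then the
    distinct point closest to each centroid is proposed for annotation."""
    rng = random.Random(seed)
    start = rng.sample(range(len(embeddings)), budget)
    centroids = [list(embeddings[i]) for i in start]
    for _ in range(iters):
        clusters = [[] for _ in range(budget)]
        for p in embeddings:
            k = min(range(budget), key=lambda c: sq_dist(p, centroids[c]))
            clusters[k].append(p)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [mean(c) if c else centroids[k]
                     for k, c in enumerate(clusters)]
    chosen = []
    for c in centroids:
        order = sorted(range(len(embeddings)),
                       key=lambda i: sq_dist(embeddings[i], c))
        chosen.append(next(i for i in order if i not in chosen))
    return chosen


def ensemble_predict(heads, x):
    """Average the probability vectors of all heads; return the argmax class."""
    probs = [head(x) for head in heads]
    avg = [sum(col) / len(probs) for col in zip(*probs)]
    return max(range(len(avg)), key=avg.__getitem__)


# Toy embeddings: two well-separated groups of four points each.
embeddings = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11)]
chosen = select_representatives(embeddings, budget=2)

# Toy heads returning fixed probabilities (stand-ins for trained classifiers).
heads = [lambda x: [0.9, 0.1], lambda x: [0.4, 0.6], lambda x: [0.3, 0.7]]
label = ensemble_predict(heads, embeddings[chosen[0]])
```

With an annotation budget of two, the sampler proposes one point from each group, so both regions of the embedding space are covered by the first labels. Averaging several heads is one way to avoid picking a single model by validation score, which is not possible when no labeled validation set exists.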
Publication type Article: Conference contribution
ISSN (print) / ISBN 1945-7928
e-ISSN 1945-8452
Conference title Proceedings - International Symposium on Biomedical Imaging
Conference date 27-30 May 2024
Conference location Athens
Non-patent literature Publications
Institute(s) Institute of AI for Health (AIH)
Helmholtz Artificial Intelligence Cooperation Unit (HAICU)