A data-driven solution for the cold start problem in biomedical image classification.
In: Proceedings - International Symposium on Biomedical Imaging, 27-30 May 2024, Athens. 2024. DOI: 10.1109/ISBI56570.2024.10635886
The demand for large quantities of high-quality annotated images poses a significant bottleneck for developing effective deep learning-based classifiers in the biomedical domain. We present a simple yet powerful solution to the cold start problem, i.e., selecting the most informative data for annotation within an unlabeled dataset. Our framework consists of three key components: (i) a self-supervised encoder to construct meaningful representations of unlabeled data, (ii) a sampling method selecting the most representative data points for annotation, and (iii) a classifier head using model ensembling to overcome the lack of validation data. We test our approach on four challenging public biomedical datasets. Our strategy outperforms the state-of-the-art approach in detecting the representative data points in all datasets and achieves a 7% improvement on a leukemia blood cell classification task. Our work offers a practical and efficient solution to the challenges associated with tedious and costly, high-quality data annotations in the biomedical field. We make our framework's code publicly available at https://github.com/marrlab/initial-data-point-selection.
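The selection and ensembling steps named in the abstract can be illustrated with a minimal sketch. The abstract does not say which sampling method or ensembling rule the framework uses, so the choices here are assumptions made for illustration only: farthest point sampling stands in for the sampling method, and averaging class probabilities stands in for model ensembling. The function names `farthest_point_sampling` and `ensemble_predict` are hypothetical and are not taken from the published code; the embeddings are assumed to come from some self-supervised encoder.

```python
import math


def farthest_point_sampling(embeddings, budget):
    """Greedily pick `budget` indices that spread across the embedding space.

    Illustrative stand-in only: the abstract does not name the sampling method.
    """
    if not 0 < budget <= len(embeddings):
        raise ValueError("budget must be between 1 and the number of points")
    n, dim = len(embeddings), len(embeddings[0])
    # Deterministic start: the point closest to the dataset mean.
    mean = [sum(p[d] for p in embeddings) / n for d in range(dim)]
    first = min(range(n), key=lambda i: math.dist(embeddings[i], mean))
    selected = [first]
    # min_dist[i] is the distance from point i to its nearest selected point.
    min_dist = [math.dist(p, embeddings[first]) for p in embeddings]
    while len(selected) < budget:
        # Next pick: the point farthest from everything selected so far.
        nxt = max(range(n), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, p in enumerate(embeddings):
            min_dist[i] = min(min_dist[i], math.dist(p, embeddings[nxt]))
    return selected


def ensemble_predict(prob_lists):
    """Average class probabilities from several heads and return the top class.

    Assumed ensembling rule; the abstract only says "model ensembling".
    """
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / len(prob_lists) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c])
```

With toy 2-D embeddings forming three groups, a budget of three picks one point from each group, which is the behavior one would want from an initial annotation set chosen without any labels.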
Publication type
Article: Conference contribution
ISSN (print) / ISBN
1945-7928
e-ISSN
1945-8452
Conference Title
Proceedings - International Symposium on Biomedical Imaging
Conference Date
27-30 May 2024
Conference Location
Athens
Non-patent literature
Publications
Institute(s)
Institute of AI for Health (AIH)
Helmholtz Artificial Intelligence Cooperation Unit (HAICU)