PuSH - Publication Server of Helmholtz Zentrum München

Li, J.* ; Kim, S.H.* ; Müller, P.* ; Felsner, F.* ; Rueckert, D.* ; Wiestler, B.* ; Schnabel, J.A. ; Bercea, C.-I.

Language models meet anomaly detection for better interpretability and generalizability.

In: (Medical Image Computing and Computer Assisted Intervention – MICCAI 2024). Berlin [etc.]: Springer, 2025. 113-123 (Lect. Notes Comput. Sc. ; 15401 LNCS)
Open Access Green
This research explores the integration of language models and unsupervised anomaly detection in medical imaging, addressing two key questions: (1) Can language models enhance the interpretability of anomaly detection maps? and (2) Can anomaly maps improve the generalizability of language models in open-set anomaly detection tasks? To investigate these questions, we introduce a new dataset for multi-image visual question-answering on brain magnetic resonance images encompassing multiple conditions. We propose KQ-Former (Knowledge Querying Transformer), which is designed to optimally align visual and textual information in limited-sample contexts. Our model achieves 60.81% accuracy on closed questions, covering disease classification and severity across 15 different classes. For open questions, KQ-Former demonstrates a 70% improvement over the baseline with a BLEU-4 score of 0.41, and achieves the highest entailment ratios (up to 71.9%) and lowest contradiction ratios (down to 10.0%) among various natural language inference models. Furthermore, integrating anomaly maps results in an 18% accuracy increase in detecting open-set anomalies, thereby enhancing the language model’s generalizability to previously unseen medical conditions. The code and dataset are available at: https://github.com/compai-lab/miccai-2024-junli?tab=readme-ov-file.
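The abstract reports entailment and contradiction ratios without stating how they are computed. As a minimal sketch, assuming each generated answer has already been assigned one of three labels (entailment, neutral, contradiction) by a natural language inference model when compared with its reference answer, the ratios could be the share of answers per label. The function name `nli_ratios` and the example labels are hypothetical and are not taken from the paper or its repository.

```python
from collections import Counter

# Assumed label set; the paper's NLI models may use different label names.
NLI_LABELS = ("entailment", "neutral", "contradiction")


def nli_ratios(labels):
    """Return the fraction of answers carrying each NLI label."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = len(labels)
    return {name: counts.get(name, 0) / total for name in NLI_LABELS}


# Hypothetical labels for five generated answers (illustrative only).
example_labels = [
    "entailment",
    "entailment",
    "neutral",
    "contradiction",
    "entailment",
]
ratios = nli_ratios(example_labels)
print(ratios)
```

Under this reading, a higher entailment ratio and a lower contradiction ratio both indicate generated answers that agree more often with the reference answers.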
Publication type: Article: Conference contribution
Keywords: Multimodal Learning; Vision-language Models; VQA
ISSN (print) / ISBN: 0302-9743
e-ISSN: 1611-3349
Conference title: Medical Image Computing and Computer Assisted Intervention – MICCAI 2024
Source: Volume: 15401 LNCS, Pages: 113-123
Publisher: Springer
Place of publication: Berlin [etc.]
Non-patent literature: Publications
Institute(s): Institute for Machine Learning in Biomed Imaging (IML)