Language models meet anomaly detection for better interpretability and generalizability.
In: Medical Image Computing and Computer Assisted Intervention – MICCAI 2024. Berlin [and others]: Springer, 2025, pp. 113-123 (Lect. Notes Comput. Sc.; 15401 LNCS)
This research explores the integration of language models and unsupervised anomaly detection in medical imaging, addressing two key questions: (1) Can language models enhance the interpretability of anomaly detection maps? and (2) Can anomaly maps improve the generalizability of language models in open-set anomaly detection tasks? To investigate these questions, we introduce a new dataset for multi-image visual question-answering on brain magnetic resonance images encompassing multiple conditions. We propose KQ-Former (Knowledge Querying Transformer), which is designed to optimally align visual and textual information in limited-sample contexts. Our model achieves 60.81% accuracy on closed questions, which cover disease classification and severity across 15 different classes. For open questions, KQ-Former reaches a BLEU-4 score of 0.41, a 70% improvement over the baseline, and achieves the highest entailment ratios (up to 71.9%) and lowest contradiction ratios (down to 10.0%) among various natural language inference models. Furthermore, integrating anomaly maps yields an 18% accuracy increase in detecting open-set anomalies, thereby enhancing the language model’s generalizability to previously unseen medical conditions. The code and dataset are available at: https://github.com/compai-lab/miccai-2024-junli?tab=readme-ov-file.
Publication type
Article: Conference contribution
Keywords
Multimodal Learning; Vision-Language Models; VQA
Language
English
Publication Year
2025
HGF-reported in Year
2025
ISSN (print) / ISBN
0302-9743
e-ISSN
1611-3349
Conference Title
Medical Image Computing and Computer Assisted Intervention – MICCAI 2024
Source details
Volume: 15401 LNCS
Pages: 113-123
Publisher
Springer
Publishing Place
Berlin [and others]
Institute(s)
Institute for Machine Learning in Biomed Imaging (IML)
POF-Topic(s)
30205 - Bioengineering and Digital Health
Research field(s)
Enabling and Novel Technologies
PSP Element(s)
G-507100-001
Date of entry
2025-05-22