PuSH - Publication Server of Helmholtz Zentrum München: Language models meet anomaly detection for better interpretability and generalizability.

Li, J.* ; Kim, S.H.* ; Müller, P.* ; Felsner, F.* ; Rueckert, D.* ; Wiestler, B.* ; Schnabel, J.A. ; Bercea, C.-I.

Language models meet anomaly detection for better interpretability and generalizability.

In: Medical Image Computing and Computer Assisted Intervention – MICCAI 2024. Berlin [etc.]: Springer, 2025, pp. 113-123 (Lecture Notes in Computer Science; vol. 15401)

Postprint, Open Access (Green)

Abstract

This research explores the integration of language models and unsupervised anomaly detection in medical imaging, addressing two key questions: (1) Can language models enhance the interpretability of anomaly detection maps? and (2) Can anomaly maps improve the generalizability of language models in open-set anomaly detection tasks? To investigate these questions, we introduce a new dataset for multi-image visual question-answering on brain magnetic resonance images encompassing multiple conditions. We propose KQ-Former (Knowledge Querying Transformer), which is designed to optimally align visual and textual information in limited-sample contexts. Our model achieves 60.81% accuracy on closed questions, covering disease classification and severity across 15 different classes. For open questions, KQ-Former demonstrates a 70% improvement over the baseline with a BLEU-4 score of 0.41, and achieves the highest entailment ratios (up to 71.9%) and lowest contradiction ratios (down to 10.0%) among various natural language inference models. Furthermore, integrating anomaly maps results in an 18% accuracy increase in detecting open-set anomalies, thereby enhancing the language model’s generalizability to previously unseen medical conditions. The code and dataset are available at: https://github.com/compai-lab/miccai-2024-junli?tab=readme-ov-file.
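The entailment and contradiction ratios mentioned in the abstract can be read as the share of generated answers that a natural language inference (NLI) model labels as entailed by, or contradicting, the reference answer. The abstract does not spell out the definition, so the following is only a minimal sketch under that assumption. The function name `nli_ratios`, the label strings, and the example labels are hypothetical illustrations and are not taken from the authors' repository or results.

```python
from collections import Counter

# Hypothetical label names; a real NLI model may use different strings.
NLI_LABELS = ("entailment", "neutral", "contradiction")


def nli_ratios(labels):
    """Return the fraction of predictions falling under each NLI label.

    `labels` is a non-empty list with one NLI label per
    (reference answer, generated answer) pair.
    """
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    total = len(labels)
    return {name: counts.get(name, 0) / total for name in NLI_LABELS}


# Illustrative labels only, not data from the paper.
example = ["entailment", "entailment", "neutral", "contradiction"]
ratios = nli_ratios(example)
```

Under this reading, a higher entailment ratio and a lower contradiction ratio both indicate that generated answers agree more often with the references.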
