Kurz, C.* ; Merzhevich, T.* ; Eskofier, B.M. ; Kather, J.N.* ; Gmeiner, B.*
Benchmarking vision-language models for diagnostics in emergency and critical care settings.
NPJ Digit. Med. 8:423 (2025)
The applicability of vision-language models (VLMs) for acute care in emergency and intensive care units remains underexplored. Using a multimodal dataset of diagnostic questions involving medical images and clinical context, we benchmarked several small open-source VLMs against GPT-4o. While open models demonstrated limited diagnostic accuracy (up to 40.4%), GPT-4o significantly outperformed them (68.1%). Findings highlight the need for specialized training and optimization to improve open-source VLMs for acute care applications.
Publication type
Article: Journal article
Document type
Scientific Article
Language
English
Publication Year
2025
HGF-reported in Year
2025
ISSN (print) / ISBN
2398-6352
e-ISSN
2398-6352
Source details
Volume: 8
Issue: 1
Article Number: 423
Publisher
Nature Publishing Group
Publishing Place
Heidelberger Platz 3, Berlin, 14197, Germany
Reviewing status
Peer reviewed
POF-Topic(s)
30205 - Bioengineering and Digital Health
Research field(s)
Enabling and Novel Technologies
PSP Element(s)
G-540008-001
Grants
Novartis Pharma
Date of record entry
2025-07-14