PuSH - Publication Server of Helmholtz Zentrum München

Kurz, C.* ; Merzhevich, T.* ; Eskofier, B.M. ; Kather, J.N.* ; Gmeiner, B.*

Benchmarking vision-language models for diagnostics in emergency and critical care settings.

NPJ Digit. Med. 8:423 (2025)
Open Access Gold
Creative Commons License
The applicability of vision-language models (VLMs) for acute care in emergency and intensive care units remains underexplored. Using a multimodal dataset of diagnostic questions involving medical images and clinical context, we benchmarked several small open-source VLMs against GPT-4o. While open models demonstrated limited diagnostic accuracy (up to 40.4%), GPT-4o significantly outperformed them (68.1%). Findings highlight the need for specialized training and optimization to improve open-source VLMs for acute care applications.
Impact Factor 15.100
Scopus SNIP 0.000
Publication type Article: Journal article
Document type Scientific Article
Language English
Publication Year 2025
HGF-reported in Year 2025
ISSN (print) / ISBN 2398-6352
e-ISSN 2398-6352
Citation details Volume: 8, Issue: 1, Article Number: 423
Publisher Nature Publishing Group
Publishing Place Heidelberger Platz 3, Berlin, 14197, Germany
Reviewing status Peer reviewed
POF-Topic(s) 30205 - Bioengineering and Digital Health
Research field(s) Enabling and Novel Technologies
PSP Element(s) G-540008-001
Grants Novartis Pharma
Scopus ID 105010500238
PubMed ID 40640347
Date of entry 2025-07-14