PuSH - Publication Server of Helmholtz Zentrum München

Kurz, C.* ; Merzhevich, T.* ; Eskofier, B.M. ; Kather, J.N.* ; Gmeiner, B.*

Benchmarking vision-language models for diagnostics in emergency and critical care settings.

NPJ Digit. Med. 8:423 (2025)
Open Access Gold
Creative Commons license
The applicability of vision-language models (VLMs) for acute care in emergency and intensive care units remains underexplored. Using a multimodal dataset of diagnostic questions involving medical images and clinical context, we benchmarked several small open-source VLMs against GPT-4o. While open models demonstrated limited diagnostic accuracy (up to 40.4%), GPT-4o significantly outperformed them (68.1%). Findings highlight the need for specialized training and optimization to improve open-source VLMs for acute care applications.
Publication type: Article: Journal article
Document type: Scientific article
ISSN (print) / ISBN: 2398-6352
e-ISSN: 2398-6352
Journal: NPJ digital medicine
Citation: Volume: 8, Issue: 1, Article number: 423
Publisher: Nature Publishing Group
Non-patent literature: Publications
Review status: Peer reviewed
Institute(s): Institute of AI for Health (AIH)