PuSH - Publication Server of Helmholtz Zentrum München

Jarchow, H.* ; Bobrowski, C.* ; Falk, S.* ; Hermann, A.* ; Kulaga, A.* ; Põder, J.C.* ; Unfried, M.* ; Usanov, N.* ; Zendeh, B.* ; Kennedy, B.K.* ; Lobentanzer, S. ; Fuellen, G.*

Benchmarking large language models for personalized, biomarker-based health intervention recommendations.

NPJ Digit. Med. 8:631 (2025)
Open Access Gold
Creative Commons license
The use of large language models (LLMs) in clinical diagnostics and intervention planning is expanding, yet their utility for personalized recommendations for longevity interventions remains opaque. We extended the BioChatter framework to benchmark LLMs' ability to generate personalized longevity intervention recommendations based on biomarker profiles while adhering to key medical validation requirements. Using 25 individual profiles across three different age groups, we generated 1000 diverse test cases covering interventions such as caloric restriction, fasting and supplements. Evaluating 56000 model responses via an LLM-as-a-Judge system with clinician-validated ground truths, we found that proprietary models outperformed open-source models, especially in comprehensiveness. However, even with Retrieval-Augmented Generation (RAG), all models exhibited limitations in addressing key medical validation requirements, prompt stability, and handling age-related biases. Our findings highlight the limited suitability of LLMs for unsupervised longevity intervention recommendations. Our open-source framework offers a foundation for advancing AI benchmarking in various medical contexts.
Impact Factor 15.100
Scopus SNIP 0.000
Publication type Article: Journal article
Document type Scientific Article
Language English
Publication Year 2025
HGF-reported in Year 2025
ISSN (print) / ISBN 2398-6352
e-ISSN 2398-6352
Source details Volume: 8, Issue: 1, Article Number: 631
Publisher Nature Publishing Group
Reviewing status Peer reviewed
POF-Topic(s) 30205 - Bioengineering and Digital Health
Research field(s) Enabling and Novel Technologies
PSP Element(s) G-503800-001
Scopus ID 105019758288
PubMed ID 41145883
Date of entry 2025-10-29