PuSH - Publication Server of Helmholtz Zentrum München

Knolle, M.* ; Menten, M.J.* ; Rueckert, D.* ; Kaissis, G. ; Glocker, B.*

Memorisation Bias: AI Predictions for Data Contributors Are Biased Towards Their Health States in the Training Data.

In:. Berlin [u.a.]: Springer, 2026. 24-33 (Lect. Notes Comput. Sc. ; 16184 LNCS)
AI models are increasingly deployed in clinical practice to assist doctors in diagnostic or screening tasks. However, a critical concern arises from the inherent ability of modern AI models to memorise individual examples from their training datasets. Such memorisation could lead to inaccurate predictions when a model is later used on individuals whose historical data was (potentially unknowingly) used for model training or fine-tuning. In this study, we discover evidence for memorisation bias in two large medical imaging datasets: CheXpert (chest radiography) and Kermany-OCT (optical coherence tomography). Our experiments reveal that a small proportion of data-contributing patients (0.6% for CheXpert and 1.1% for Kermany-OCT) exhibit significant changes in their predictions on (future) longitudinal evaluation data when their historical data is included in model training. Strikingly, we find that larger, more diagnostically accurate models exhibit increased memorisation bias: for Kermany-OCT, the proportion of data-contributing patients affected by memorisation increases substantially (from 1.1% to 7.2%) when scaling model size from 1.5 to 80 million parameters. Together, our results raise the question of whether the future health outcomes of data-contributing patients could be adversely affected by memorisation bias, i.e., predictions that are biased towards their previous health states.
Publication type Article: Conference contribution
Keywords Deep Learning ; Medical Imaging ; Memorisation
ISSN (print) / ISBN 0302-9743
e-ISSN 1611-3349
Source details Volume: 16184 LNCS, Pages: 24-33
Publisher Springer
Publishing Place Berlin [u.a.]
Institute(s) Institute for Machine Learning in Biomed Imaging (IML)