PuSH - Publication Server of Helmholtz Zentrum München

Biases in machine-learning models of human single-cell data.

Nat. Cell Biol. 27, 384–392 (2025)
Open Access Green
Recent machine-learning (ML)-based advances in single-cell data science have enabled the stratification of human tissue donors at single-cell resolution, promising to provide valuable diagnostic and prognostic insights. However, such insights are susceptible to biases. Here we discuss various biases that emerge along the pipeline of ML-based single-cell analysis, ranging from societal biases affecting whose samples are collected, to clinical and cohort biases that influence the generalizability of single-cell datasets, biases stemming from single-cell sequencing, ML biases specific to (weakly supervised or unsupervised) ML models trained on human single-cell samples and biases during the interpretation of results from ML models. We end by providing methods for single-cell data scientists to assess and mitigate biases, and call for efforts to address the root causes of biases.
Publication type Article: Journal article
Document type Review
Keywords Genomics; Racism; Race
ISSN (print) / ISBN 1465-7392
e-ISSN 1476-4679
Source details Volume: 27, Pages: 384–392
Publisher Nature Publishing Group
Publishing Place Heidelberger Platz 3, Berlin, 14197, Germany
Non-patent literature Publications
Reviewing status Peer reviewed
Grants Helmholtz Association under the joint research school 'Munich School for Data Science'