How machine learning and statistical models advance molecular diagnostics of rare disorders via analysis of RNA sequencing data.
Front. Mol. Biosci. 8:647277 (2021)
Rare diseases, although individually rare, collectively affect approximately 350 million people worldwide. Currently, nearly 6,000 distinct rare disorders with a known molecular basis have been described, yet establishing a specific diagnosis based on the clinical phenotype is challenging. Increasing integration of whole exome sequencing into routine diagnostics of rare diseases is improving diagnostic rates. Nevertheless, about half of the patients do not receive a genetic diagnosis due to the challenges of variant detection and interpretation. In recent years, RNA sequencing has increasingly been used as a complementary diagnostic tool that provides functional data. Initially, arbitrary thresholds were applied to call aberrant expression, aberrant splicing, and mono-allelic expression. As RNA sequencing came to be applied to the search for a molecular diagnosis, robust statistical models fitted to normalized read counts made it possible to detect significant outliers while correcting for multiple testing. More recently, machine learning methods have been developed to improve the normalization of RNA sequencing read count data by taking confounders into account. Together, these methods have increased the power and sensitivity of detection and interpretation of pathogenic variants, leading to diagnostic rates of 10–35% in rare diseases. In this review, we provide an overview of the methods used for RNA sequencing and illustrate how these can improve the diagnostic yield of rare diseases.
Publication type
Article: Journal article
Document type
Review
Keywords
Aberrant Expression; Aberrant Splicing; Machine Learning; Mono-allelic Expression; Rare Disorders; RNA Sequencing; Statistical Models; Principal Component Analysis; Gene-expression; Seq Data; Human Transcriptome; Variants; Disease; Genome; Identification; Complexity; Mutations
Language
English
Publication year
2021
HGF reporting year
2021
ISSN (print) / ISBN
2296-889X
e-ISSN
2296-889X
Source details
Volume: 8
Article number: 647277
Publisher
Frontiers
Place of publication
Lausanne
Review status
Peer reviewed
POF Topic(s)
30205 - Bioengineering and Digital Health
Research field(s)
Genetics and Epidemiology
PSP element(s)
G-503292-001
Funding
Bavarian State Ministry of Health and Care
Medical Informatics Initiative CORD-MI (Collaboration on Rare Diseases)
PerMiM Personalized Mitochondrial Medicine
BMBF (German Federal Ministry of Education and Research) through mitoNET German Network for Mitochondrial Diseases
Date of entry
2021-06-23