Transfer learning identifies sequence determinants of cell-type specific regulatory element accessibility.
NAR Gen. Bioinfo. 5:lqad026 (2023)
Dysfunction of regulatory elements through genetic variants is a central mechanism in the pathogenesis of disease. To better understand disease etiology, there is consequently a need to understand how DNA encodes regulatory activity. Deep learning methods show great promise for modeling of biomolecular data from DNA sequence but are limited to large input data for training. Here, we develop ChromTransfer, a transfer learning method that uses a pre-trained, cell-type agnostic model of open chromatin regions as a basis for fine-tuning on regulatory sequences. We demonstrate superior performances with ChromTransfer for learning cell-type specific chromatin accessibility from sequence compared to models not informed by a pre-trained model. Importantly, ChromTransfer enables fine-tuning on small input data with minimal decrease in accuracy. We show that ChromTransfer uses sequence features matching binding site sequences of key transcription factors for prediction. Together, these results demonstrate ChromTransfer as a promising tool for learning the regulatory code.
Publication type
Article: Journal article
Document type
Scientific Article
Keywords
Transcription Factors; Gene-expression; Enhancers; Binding; Genome
Language
English
Publication Year
2023
HGF-reported in Year
2023
ISSN (print) / ISBN
2631-9268
e-ISSN
2631-9268
Source details
Volume: 5
Issue: 2
Article Number: lqad026
Publisher
Oxford University Press
Publishing Place
Great Clarendon St, Oxford OX2 6DP, England
Reviewing status
Peer reviewed
POF-Topic(s)
30205 - Bioengineering and Digital Health
Research field(s)
Enabling and Novel Technologies
PSP Element(s)
G-503800-004
G-503800-001
Date of entry
2023-10-06