Kreimer, A.* ; Zeng, H.* ; Edwards, M.D.* ; Guo, Y.* ; Tian, K.* ; Shin, S.* ; Welch, R.* ; Wainberg, M.* ; Mohan, R.* ; Sinnott-Armstrong, N.A.* ; Li, Y.* ; Eraslan, G. ; Amin, T.B.* ; Goke, J.* ; Müller, N.S. ; Kellis, M.* ; Kundaje, A.* ; Beer, M.A.* ; Keles, S.* ; Gifford, D.K.* ; Yosef, N.*
Predicting gene expression in massively parallel reporter assays: A comparative study.
Hum. Mutat. 38, 1240-1250 (2017)
In many human diseases, associated genetic changes tend to occur within non-coding regions, whose effect might be related to transcriptional control. A central goal in human genetics is to understand the function of such non-coding regions: Given a region that is statistically associated with changes in gene expression (expression Quantitative Trait Locus; eQTL), does it in fact play a regulatory role? And if so, how is this role "coded" in its sequence? These questions were the subject of the Critical Assessment of Genome Interpretation eQTL challenge. Participants were given a set of sequences that flank eQTLs in humans and were asked to predict whether these are capable of regulating transcription (as evaluated by massively parallel reporter assays), and whether this capability changes between alternative alleles. Here, we report lessons learned from this community effort. By inspecting predictive properties in isolation, and conducting meta-analysis over the competing methods, we find that using chromatin accessibility and transcription factor binding as features in an ensemble of classifiers or regression models leads to the most accurate results. We then characterize the loci that are harder to predict, putting the spotlight on areas of weakness, which we expect to be the subject of future studies.
Publication type
Article: Journal article
Document type
Scientific Article
Keywords
eQTLs; Functional Genomics; Gene Regulation; Massively Parallel Reporter Assays; Protein-DNA Interactions; Binding Microarray Data; Human Genome; In Vivo; Transcriptional Regulation; Systematic Dissection; Regulatory Motifs; Online Database; Chromatin; Variants
Language
English
Publication Year
2017
HGF-reported in Year
2017
ISSN (print) / ISBN
1059-7794
e-ISSN
1098-1004
Source details
Volume: 38
Issue: 9
Pages: 1240-1250
Publisher
Wiley
Publishing Place
Hoboken
Reviewing status
Peer reviewed
POF-Topic(s)
30205 - Bioengineering and Digital Health
Research field(s)
Enabling and Novel Technologies
PSP Element(s)
G-503800-001
Date of record entry
2017-05-24