PuSH - Publication Server of Helmholtz Zentrum München

Hücker, S.M.* ; Ardern, Z.* ; Goldberg, T.* ; Schafferhans, A.* ; Bernhofer, M.* ; Vestergaard, G. ; Nelson, C.W.* ; Schloter, M. ; Rost, B.* ; Scherer, S.* ; Neuhaus, K.*

Discovery of numerous novel small genes in the intergenic regions of the Escherichia coli O157:H7 Sakai genome.

PLoS ONE 12:e0184119 (2017)
Open Access Gold
Creative Commons license
In the past, short protein-coding genes were often disregarded by genome annotation pipelines. Transcriptome sequencing (RNAseq) signals outside of annotated genes have usually been interpreted to indicate either ncRNA or pervasive transcription. Therefore, in addition to the transcriptome, the translatome (RIBOseq) of the enteric pathogen Escherichia coli O157:H7 strain Sakai was determined under two optimal growth conditions and a severe stress condition combining low temperature and high osmotic pressure. All intergenic open reading frames potentially encoding a protein of ≥ 30 amino acids were investigated with regard to coverage by transcription and translation signals and their translatability, expressed by the ribosomal coverage value. This led to the discovery of 465 unique, putative novel genes not yet annotated in this E. coli strain, which are evenly distributed over both DNA strands of the genome. For 255 of the novel genes, annotated homologs in other bacteria were found, and a machine-learning algorithm, trained on small protein-coding E. coli genes, predicted that 89% of these translated open reading frames represent bona fide genes. The remaining 210 putative novel genes without annotated homologs were compared to the 255 novel genes with homologs and to 250 short annotated genes of this E. coli strain. All three groups turned out to be similar with respect to their translatability distribution, fractions of differentially regulated genes, secondary structure composition, and the distribution of evolutionary constraint, suggesting that both novel groups represent legitimate genes. However, the machine-learning algorithm recognized only a small fraction of the 210 genes without annotated homologs. It is possible that these genes represent a novel group of genes with unusual features dissimilar to those of the genes in the machine-learning algorithm's training set.
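The screening step described in the abstract (open reading frames of at least 30 amino acids, scored by a ribosomal coverage value) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The start and stop codon sets, the function names, and the RPKM normalization are all assumptions made for illustration. The abstract does not define the ribosomal coverage value; the sketch assumes it is the ratio of the RIBOseq signal to the RNAseq signal for the same ORF.

```python
# Illustrative sketch only. Codon sets, names, and the definition of the
# ribosomal coverage value are assumptions, not taken from the record above.
START_CODONS = {"ATG", "GTG", "TTG"}   # assumed bacterial start codons
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_aa=30):
    """List ORFs of at least min_aa codons on both strands.

    Each hit is (strand, start, end, length_aa). Coordinates are 0-based,
    half-open, and refer to the scanned strand. The span includes the stop
    codon; length_aa does not. Only the first in-frame start codon before
    each stop is used, so the longest ORF per stop is reported.
    """
    orfs = []
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in START_CODONS:
                    start = i
                elif codon in STOP_CODONS:
                    if start is not None:
                        length_aa = (i - start) // 3
                        if length_aa >= min_aa:
                            orfs.append((strand, start, i + 3, length_aa))
                    start = None
    return orfs


def rpkm(read_count, length_nt, total_mapped_reads):
    """Reads per kilobase per million mapped reads."""
    return read_count * 1e9 / (length_nt * total_mapped_reads)


def ribosomal_coverage_value(ribo_rpkm, rna_rpkm):
    """Assumed definition: RIBOseq RPKM divided by RNAseq RPKM.

    Returns None when there is no transcript signal to normalize against.
    """
    if rna_rpkm == 0:
        return None
    return ribo_rpkm / rna_rpkm
```

In a real analysis the ORF list would additionally be restricted to intergenic coordinates using the existing annotation, and read counts would come from mapped RNAseq and RIBOseq libraries; both steps are omitted here.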
Impact Factor 2.806
Scopus SNIP 1.092
Web of Science Times Cited 13
Scopus Cited By 13
Publication type Article: Journal article
Document type Scientific Article
Keywords Ribosome Profiling Experiments; Small Membrane-proteins; Frame-encoded Peptides; Open Reading Frames; Transcription Termination; Dark-matter; Human-cells; Rna-seq; Translation; Sequence
Language English
Publication year 2017
HGF reporting year 2017
ISSN (print) / ISBN 1932-6203
Journal PLoS ONE
Citation Volume: 12, Issue: 9, Article number: e0184119
Publisher Public Library of Science (PLoS)
Place of publication Lawrence, Kan.
Review status Peer reviewed
POF Topic(s) 30202 - Environmental Health
Research field(s) Environmental Sciences
PSP element(s) G-504700-001
PubMed ID 28902868
Scopus ID 85029431937
Date of entry 2017-09-25