Hücker, S.M.* ; Ardern, Z.* ; Goldberg, T.* ; Schafferhans, A.* ; Bernhofer, M.* ; Vestergaard, G. ; Nelson, C.W.* ; Schloter, M. ; Rost, B.* ; Scherer, S.* ; Neuhaus, K.*
     
 
    
        
Discovery of numerous novel small genes in the intergenic regions of the Escherichia coli O157:H7 Sakai genome.
    
    
        
    
    
        
        PLoS ONE 12:e0184119 (2017)
    
    
    
		
		
			
				In the past, short protein-coding genes were often disregarded by genome annotation pipelines. Transcriptome sequencing (RNAseq) signals outside of annotated genes have usually been interpreted to indicate either ncRNA or pervasive transcription. Therefore, in addition to the transcriptome, the translatome (RIBOseq) of the enteric pathogen Escherichia coli O157:H7 strain Sakai was determined at two optimal growth conditions and a severe stress condition combining low temperature and high osmotic pressure. All intergenic open reading frames potentially encoding a protein of ≥ 30 amino acids were investigated with regard to coverage by transcription and translation signals and their translatability expressed by the ribosomal coverage value. This led to discovery of 465 unique, putative novel genes not yet annotated in this E. coli strain, which are evenly distributed over both DNA strands of the genome. For 255 of the novel genes, annotated homologs in other bacteria were found, and a machine-learning algorithm, trained on small protein-coding E. coli genes, predicted that 89% of these translated open reading frames represent bona fide genes. The remaining 210 putative novel genes without annotated homologs were compared to the 255 novel genes with homologs and to 250 short annotated genes of this E. coli strain. All three groups turned out to be similar with respect to their translatability distribution, fractions of differentially regulated genes, secondary structure composition, and the distribution of evolutionary constraint, suggesting that both novel groups represent legitimate genes. However, the machine-learning algorithm only recognized a small fraction of the 210 genes without annotated homologs. It is possible that these genes represent a novel group of genes, which have unusual features dissimilar to the genes of the machine-learning algorithm training set.
			
			
				
			
		 
		
			
				
					
					
				 
				
			 
		 
		
     
    
        Publication type
        Article: Journal article
    
 
    
        Document type
        Scientific article
    
 
    
    
    
        Keywords
        Ribosome Profiling Experiments; Small Membrane-proteins; Frame-encoded Peptides; Open Reading Frames; Transcription Termination; Dark-matter; Human-cells; Rna-seq; Translation; Sequence
    
 
    
    
 
    
    
        Language
        English
    
 
    
        Publication year
        2017
    
 
    
    
 
    
        HGF reporting year
        2017
    
 
    
    
        ISSN (print) / ISBN
        1932-6203
    
 
    
    
 
     
		
    
        Source details
        
	    Volume: 12,  
	    Issue: 9,  
	    Article number: e0184119 
	
    
 
  
        
        
 
        
            Publisher
            Public Library of Science (PLoS)
        
 
        
            Place of publication
            Lawrence, Kan.
        
 
	
        
        
 
    
        Review status
        Peer reviewed
    
 
     
    
        POF Topic(s)
        30202 - Environmental Health
    
 
    
        Research field(s)
        Environmental Sciences
    
 
    
        PSP element(s)
        G-504700-001
    
 
    
    
 	
    
    
    
    
    
        Record entry date
        2017-09-25