Sicilia, C.* ; Corral-Lugo, A.* ; Smialowski, P. ; McConnell, M.J.* ; Martín-Galiano, A.J.*
Unsupervised machine learning organization of the functional dark proteome of gram-negative "superbugs": Six protein clusters amenable for distinct scientific applications.
ACS Omega 7, 46131–46145 (2022)
Uncharacterized proteins have been underutilized as targets for the development of novel therapeutics for difficult-to-treat bacterial infections. To facilitate the exploration of these proteins, 2819 predicted, uncharacterized proteins (19.1% of the total) from reference strains of multidrug-resistant Acinetobacter baumannii, Klebsiella pneumoniae, and Pseudomonas aeruginosa species were organized using an unsupervised k-means machine learning algorithm. Classification using normalized values for protein length, pI, hydrophobicity, degree of conservation, structural disorder, and %AT of the coding gene rendered six natural clusters. Cluster proteins showed different trends regarding operon membership, expression, presence of domains of unknown function, and interactomic relevance. Clusters 2, 4, and 5 were enriched with highly disordered proteins, nonworkable membrane proteins, and likely spurious proteins, respectively. Clusters 1, 3, and 6 showed closer distances to known antigens, antibiotic targets, and virulence factors. Up to 21.8% of proteins in these clusters were structurally covered by modeling, which allowed assessment of druggability and discontinuous B-cell epitopes. Five proteins (4 in Cluster 1) were potential druggable targets for antibiotherapy. Eighteen proteins (11 in Cluster 6) were strong B-cell and T-cell immunogen candidates for vaccine development. In conclusion, we provide a feature-based schema to fractionate the functional dark proteome of critical pathogens for fundamental and biomedical purposes.
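The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say how the features were normalized or how k was chosen, so z-score scaling, a plain Lloyd's k-means, and the fixed k = 6 are assumptions here, and the feature values are synthetic placeholders produced by a hypothetical `make_synthetic_proteins` helper.

```python
import math
import random

# Feature names follow the abstract; all values below are synthetic.
FEATURES = ["length", "pI", "hydrophobicity", "conservation", "disorder", "pct_AT"]


def make_synthetic_proteins(n=300, seed=1):
    """Hypothetical helper: random feature vectors standing in for real proteins."""
    rng = random.Random(seed)
    return [[
        rng.uniform(50, 1200),    # length in residues
        rng.uniform(4.0, 11.0),   # isoelectric point (pI)
        rng.uniform(-1.5, 1.5),   # hydrophobicity score
        rng.uniform(0.0, 1.0),    # degree of conservation
        rng.uniform(0.0, 1.0),    # fraction of disordered residues
        rng.uniform(30.0, 70.0),  # %AT of the coding gene
    ] for _ in range(n)]


def zscore_columns(rows):
    """Scale every feature column to zero mean and unit variance (assumed normalization)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


rows = make_synthetic_proteins()
scaled = zscore_columns(rows)
labels, centroids = kmeans(scaled, k=6)
sizes = [labels.count(j) for j in range(6)]
print(dict(zip(range(1, 7), sizes)))  # cluster number -> member count
```

Scaling matters in this sketch because raw protein length spans hundreds of units while pI spans fewer than ten; without it, Euclidean distance would be dominated by length alone.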
Publication type
Article: Journal article
Document type
Scientific Article
Language
English
Publication Year
2022
HGF-reported in Year
2022
ISSN (print) / ISBN
2470-1343
e-ISSN
2470-1343
Source details
Volume: 7
Issue: 50
Pages: 46131–46145
Publisher
American Chemical Society (ACS)
Reviewing status
Peer reviewed
POF-Topic(s)
30204 - Cell Programming and Repair
Research field(s)
Stem Cell and Neuroscience
PSP Element(s)
G-500800-001
Grants
Instituto de Salud Carlos III
Comunidad de Madrid
Record entry date
2022-12-20