PuSH - Publication Server of Helmholtz Zentrum München

Sun, B.* ; Smialowski, P. ; Straub, T.* ; Imhof, A.*

Investigation and highly accurate prediction of missed tryptic cleavages by deep learning.

J. Proteome Res. 20, 3749–3757 (2021)
Open Access Green: possible once the postprint has been submitted to the ZB (central library).
Trypsin is one of the most important and widely used proteolytic enzymes in mass spectrometry (MS)-based proteomic research. It cleaves peptide bonds exclusively at the C-terminus of lysine and arginine. However, cleavage is also affected by several factors, including specific surrounding amino acids, which results in frequent incomplete proteolysis and subsequent problems in peptide identification and quantification. Accurate annotation of missed cleavages is therefore crucial to database searching in MS analysis. Here, we present deep-learning predicting missed cleavages (dpMC), a novel algorithm for the prediction of missed trypsin cleavage sites. The algorithm predicts missed cleavages with very high accuracy: the areas under the curve (AUCs) for cross-validation and holdout testing are above 0.99, and the mean F1 score and Matthews correlation coefficient (MCC) are 0.9677 and 0.9349, respectively. We tested the algorithm on data sets from different species and different experimental conditions, and it outperforms other currently available prediction methods. In addition, coupled with propensity and motif analysis, the method provides better insight into the detailed rules of trypsin cleavage. Moreover, the method can be integrated into database searching in MS analysis to identify and quantify mass spectra effectively and efficiently.
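The cleavage rule and the two reported metrics can be illustrated with a minimal Python sketch. This is not the dpMC implementation. It only encodes the rule as stated in the abstract (cut after lysine or arginine) and the standard definitions of the F1 score and MCC. The function names `candidate_cleavage_sites`, `digest` and `f1_and_mcc`, the example sequence, and the idea of passing a set of missed sites are all assumptions made for illustration.

```python
import math


def candidate_cleavage_sites(protein: str) -> list[int]:
    """Return 0-based indices of K/R residues that have a residue after them."""
    return [i for i, aa in enumerate(protein[:-1]) if aa in "KR"]


def digest(protein: str, missed: frozenset[int] = frozenset()) -> list[str]:
    """Cut after every candidate site except those flagged as missed.

    `missed` is a hypothetical input: in practice it would come from a
    predictor of missed cleavages; here it is supplied by hand.
    """
    peptides, start = [], 0
    for i in candidate_cleavage_sites(protein):
        if i not in missed:
            peptides.append(protein[start:i + 1])
            start = i + 1
    peptides.append(protein[start:])
    return peptides


def f1_and_mcc(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float]:
    """Compute F1 and Matthews correlation coefficient from confusion counts."""
    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den else 0.0
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return f1, mcc


if __name__ == "__main__":
    seq = "AKRGLKPR"  # made-up sequence, not taken from the paper
    print(candidate_cleavage_sites(seq))      # candidate K/R positions
    print(digest(seq))                        # complete digestion
    print(digest(seq, frozenset({2})))        # site at index 2 treated as missed
```

Flagging a site as missed merges the two peptides on either side of it, which is why incomplete proteolysis changes the set of peptides a database search has to consider.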
Impact Factor 4.466
Scopus SNIP 1.016
Publication type Article: Journal article
Document type Scientific article
Keywords Deep Learning; Mass Spectrometry; Missed Cleavage; Prediction; Trypsin; Active-site; Identification; Proteases; Peptides
Language English
Publication year 2021
HGF reporting year 2021
ISSN (print) / ISBN 1535-3893
e-ISSN 1535-3907
Source Volume: 20, Issue: 7, Pages: 3749–3757
Publisher American Chemical Society (ACS)
Place of publication 1155 16th St, NW, Washington, DC 20036, USA
Review status Peer reviewed
POF Topic(s) 30204 - Cell Programming and Repair
Research field(s) Stem Cell and Neuroscience
PSP element(s) G-500800-001
Funding Deutsche Forschungsgemeinschaft
Chinese Scholarship Council
Scopus ID 85110383029
PubMed ID 34137619
Date of entry 2021-07-19