Unlike for DNA and RNA, accurate and high-throughput sequencing methods for proteins are lacking, which hinders the utility of proteomics in applications where the sequences are unknown, including variant calling, neoepitope identification, and metaproteomics. We introduce Spectralis, a de novo peptide sequencing method for tandem mass spectrometry. Spectralis leverages several innovations: a convolutional neural network layer that connects peaks in spectra spaced by amino acid masses, the formulation of fragment ion series classification as a pivotal task for de novo peptide sequencing, and a peptide-spectrum confidence score. On spectra for which database search provided a ground truth, Spectralis surpassed 40% sensitivity at 90% precision, nearly doubling state-of-the-art sensitivity. Application to unidentified spectra confirmed its superiority and showcased its applicability to variant calling. Altogether, these algorithmic innovations and the substantial sensitivity increase in the high-precision range constitute an important step toward broadly applicable peptide sequencing.
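The headline metric, sensitivity at 90% precision, can be illustrated with a minimal sketch. It assumes that each spectrum has one predicted peptide with a confidence score and a flag indicating whether the prediction matches the database-search ground truth, and that sensitivity is the fraction of all such spectra that are correctly sequenced above the score cutoff. The function name `sensitivity_at_precision` and its arguments are hypothetical and are not taken from the Spectralis code; the exact evaluation protocol of the paper may differ.

```python
from typing import Sequence


def sensitivity_at_precision(
    scores: Sequence[float],
    correct: Sequence[bool],
    min_precision: float = 0.9,
) -> float:
    """Return the highest sensitivity reachable at any score cutoff whose
    precision is at least `min_precision`.

    Assumption (illustrative only): sensitivity is the number of correct
    predictions above the cutoff divided by the number of all spectra.
    Tied scores are not grouped; each rank is treated as its own cutoff.
    """
    if len(scores) != len(correct):
        raise ValueError("scores and correct must have the same length")
    total = len(scores)
    if total == 0:
        return 0.0

    # Walk through predictions from most to least confident.
    order = sorted(range(total), key=lambda i: scores[i], reverse=True)
    best = 0.0
    n_correct = 0
    for rank, i in enumerate(order, start=1):
        n_correct += bool(correct[i])
        precision = n_correct / rank  # correct among accepted predictions
        if precision >= min_precision:
            best = max(best, n_correct / total)
    return best


# Toy example: five spectra, three of them correctly sequenced.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_correct = [True, True, False, True, False]
result = sensitivity_at_precision(toy_scores, toy_correct, min_precision=0.9)
```

In the toy example, only the two highest-scoring predictions can be accepted without dropping below 90% precision, so the sensitivity is 2 of 5 spectra.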
Funding: TUM Munich Data Science Institute (MDSI) seed fund GPU infrastructure; Bundesministerium für Bildung und Forschung (BMBF); Deutsche Forschungsgemeinschaft (DFG, German Research Foundation); European Union. This work is supported by the Bundesministerium für Bildung und Forschung (BMBF) through the project CLINSPECT-M (FKZ 031L0214A).