Open Access Green possible as soon as the postprint has been submitted to the ZB.
Feature selection pipelines with classification for non-targeted metabolomics combining the neural network and genetic algorithm.
Anal. Chem. 94, 5474-5482 (2022)
Non-targeted metabolomics via high-resolution mass spectrometry methods, such as direct infusion Fourier transform-ion cyclotron resonance mass spectrometry (DI-FT-ICR MS), produces data sets with thousands of features. By contrast, the number of samples is in general substantially lower. This disparity presents challenges when analyzing non-targeted metabolomics data sets and often requires custom methods to uncover information not always accessible via classical statistical techniques. In this work, we present a pipeline that combines a convolutional neural network with traditional statistical approaches and an adaptation of a genetic algorithm. The developed method was applied to a lifestyle intervention cohort data set, where subjects at risk of type 2 diabetes underwent an oral glucose tolerance test. Feature selection is the final result of the pipeline, achieved through classification of the data set via a neural network, with a precision-recall score of over 0.9 on the test set. The features most relevant for the described classification were then chosen via a genetic algorithm. The output of the developed pipeline encompasses approximately 200 features with high predictive scores, providing a fingerprint of the metabolic changes in the prediabetic class in the data set. Our framework presents a new approach that allows complex modeling based on convolutional neural networks to be applied to the analysis of high-resolution mass spectrometric data.
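The genetic-algorithm step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the fitness function is a simple nearest-centroid training accuracy with a size penalty (standing in for the neural network's score), and all names (`make_toy_data`, `fitness`, `run_ga`) and parameter values are assumptions chosen for illustration.

```python
import numpy as np

def make_toy_data(rng, n_samples=60, n_features=200, n_informative=5):
    """Synthetic stand-in for a wide data matrix (few samples, many features)."""
    y = np.arange(n_samples) % 2                  # two balanced classes
    X = rng.normal(size=(n_samples, n_features))
    X[:, :n_informative] += 1.5 * y[:, None]      # only the first columns carry signal
    return X, y

def fitness(mask, X, y, penalty=0.5):
    """Nearest-centroid training accuracy minus a penalty on subset size.

    Assumption: a cheap proxy replaces the classifier score used in the paper.
    """
    if not mask.any():
        return -1.0                               # an empty subset is never useful
    Xs = X[:, mask]
    c0 = Xs[y == 0].mean(axis=0)
    c1 = Xs[y == 1].mean(axis=0)
    d0 = np.linalg.norm(Xs - c0, axis=1)
    d1 = np.linalg.norm(Xs - c1, axis=1)
    accuracy = ((d1 < d0).astype(int) == y).mean()
    return float(accuracy - penalty * mask.mean())

def tournament(pop, scores, rng, k=3):
    """Return the fittest of k randomly drawn individuals."""
    idx = rng.choice(len(pop), size=k, replace=False)
    return pop[idx[np.argmax(scores[idx])]]

def run_ga(X, y, rng, pop_size=30, generations=40, p_init=0.1, p_mut=0.01):
    """Evolve boolean feature masks; return the best mask and the best score per generation."""
    n_features = X.shape[1]
    pop = rng.random((pop_size, n_features)) < p_init
    history = []
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y) for ind in pop])
        history.append(float(scores.max()))
        children = [pop[scores.argmax()].copy()]  # elitism: keep the current best
        while len(children) < pop_size:
            a = tournament(pop, scores, rng)
            b = tournament(pop, scores, rng)
            take_a = rng.random(n_features) < 0.5  # uniform crossover
            child = np.where(take_a, a, b)
            child ^= rng.random(n_features) < p_mut  # bit-flip mutation
            children.append(child)
        pop = np.array(children)
    scores = np.array([fitness(ind, X, y) for ind in pop])
    history.append(float(scores.max()))
    return pop[scores.argmax()], history

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, y = make_toy_data(rng)
    best_mask, history = run_ga(X, y, rng)
    print("selected features:", np.flatnonzero(best_mask))
```

Because the best individual is carried over unchanged in each generation, the best score in `history` cannot decrease. In a realistic setting the fitness would be evaluated on held-out data rather than on the training samples, since training accuracy overestimates performance when features far outnumber samples.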
Impact Factor
8.008
Scopus SNIP
1.455
Publication type
Article: Journal article
Document type
Scientific article
Language
English
Publication year
2022
HGF reporting year
2022
ISSN (print) / ISBN
0003-2700
e-ISSN
1520-6882
Journal
Analytical Chemistry
Source details
Volume: 94
Issue: 14
Pages: 5474-5482
Publisher
American Chemical Society (ACS)
Review status
Peer reviewed
POF Topic(s)
90000 - German Center for Diabetes Research
30202 - Environmental Health
Research field(s)
Environmental Sciences
PSP element(s)
G-501900-482
G-504800-001
Funding
Deutsches Zentrum für Diabetesforschung (DZD)
WOS ID
WOS:000805334400004
Scopus ID
85127892157
PubMed ID
35344349
Date of record entry
2022-07-26