PuSH - Publication Server of Helmholtz Zentrum München

Dietrich, S.* ; Floegel, A.* ; Troll, M. ; Kuhn, T.* ; Rathmann, W.* ; Peters, A. ; Sookthai, D.* ; von Bergen, M.* ; Kaaks, R.* ; Adamski, J. ; Prehn, C. ; Boeing, H.* ; Schulze, M.B.* ; Illig, T.* ; Pischon, T.* ; Knüppel, S.* ; Wang-Sattler, R. ; Drogan, D.*

Random survival forest in practice: A method for modelling complex metabolomics data in time to event analysis.

Int. J. Epidemiol. 45, 1406-1420 (2016)
Open Access Green as soon as Postprint is submitted to ZB.
BACKGROUND: The application of metabolomics in prospective cohort studies is statistically challenging. Given the importance of appropriate statistical methods for selecting disease-associated metabolites in highly correlated complex data, we combined random survival forest (RSF) with an automated backward elimination procedure that addresses such issues.
METHODS: Our RSF approach was illustrated with data from the European Prospective Investigation into Cancer and Nutrition (EPIC)-Potsdam study, with concentrations of 127 serum metabolites as exposure variables and time to development of type 2 diabetes mellitus (T2D) as the outcome variable. A Cox regression analysis of this data set using a stepwise selection method was recently published. The methodological comparison (RSF versus Cox regression) was replicated in two independent cohorts. Finally, the R code for implementing the metabolite selection procedure in the RSF syntax is provided.
RESULTS: Applying the RSF approach in EPIC-Potsdam identified 16 metabolites associated with incident T2D, which slightly improved prediction of T2D when used in addition to traditional T2D risk factors and also when used together with classical biomarkers. The identified metabolites partly agreed with previous findings using Cox regression, though RSF selected a higher number of highly correlated metabolites.
CONCLUSIONS: The RSF method appeared to be a promising approach for identifying disease-associated variables in complex data with time to event as the outcome. The demonstrated RSF approach yields findings comparable to those of the commonly used Cox regression, but also addresses the problem of multicollinearity and is suitable for high-dimensional data.
Impact Factor 7.522
Scopus SNIP 2.762
Web of Science Times Cited 31
Scopus Cited By 39
Publication type Article: Journal article
Document type Scientific Article
Keywords Cox Proportional Hazards Regression ; Exploratory Survival Analysis ; Metabolomics ; Multicollinearity ; Random Survival Forest ; Right-censored Data ; Type 2 Diabetes Mellitus ; Variable Selection ; Type-2 Diabetes-mellitus ; Serum Metabolomics ; Insulin-resistance ; Epic-germany ; Metabolite Profiles ; Prediction Models ; Cancer ; Risk ; Biomarkers
Language English
Publication Year 2016
HGF-reported in Year 2016
ISSN (print) / ISBN 0300-5771
e-ISSN 1464-3685
Source details Volume: 45, Issue: 1, Pages: 1406-1420
Publisher Oxford University Press
Publishing Place Oxford
Reviewing status Peer reviewed
Institute(s) Institute of Epidemiology (EPI)
Molekulare Endokrinologie und Metabolismus (MEM)
POF-Topic(s) 30202 - Environmental Health
90000 - German Center for Diabetes Research
30201 - Metabolic Health
Research field(s) Genetics and Epidemiology
PSP Element(s) G-504091-003
G-504000-001
G-501900-402
G-505600-003
G-505600-001
PubMed ID 27591264
Date of entry 2016-09-05