Atabaki-Pasdar, N.* ; Ohlsson, M.* ; Viñuela, A.* ; Frau, F.* ; Pomares-Millan, H.* ; Haid, M. ; Jones, A.G.* ; Thomas, E.L.* ; Koivula, R.W.* ; Kurbasic, A.* ; Mutie, P.M.* ; Fitipaldi, H.* ; Fernández, J.* ; Dawed, A.Y.* ; Giordano, G.N.* ; Forgie, I.M.* ; McDonald, T.J.* ; Rutters, F.* ; Cederberg, H.* ; Chabanova, E.* ; Dale, M.* ; Masi, F.* ; Thomas, C.E.* ; Allin, K.H.* ; Hansen, T.H.* ; Heggie, A.* ; Hong, M.G.* ; Elders, P.J.M.* ; Kennedy, G.* ; Kokkola, T.* ; Pedersen, H.K.* ; Mahajan, A.* ; McEvoy, D.* ; Pattou, F.* ; Raverdy, V.* ; Häussler, R.S.* ; Sharma, S. ; Thomsen, H.S.* ; Vangipurapu, J.* ; Vestergaard, H.* ; 't Hart, L.M.* ; Adamski, J. ; Musholt, P.B.* ; Brage, S.* ; Brunak, S.* ; Dermitzakis, E.* ; Frost, G.* ; Hansen, T.* ; Laakso, M.* ; Pedersen, O.* ; Ridderstråle, M.* ; Ruetten, H.* ; Hattersley, A.T.* ; Walker, M.* ; Beulens, J.W.J.* ; Mari, A.* ; Schwenk, J.M.* ; Gupta, R.* ; McCarthy, M.I.* ; Pearson, E.R.* ; Bell, J.D.* ; Pavo, I.* ; Franks, P.W.*
Predicting and elucidating the etiology of fatty liver disease: A machine learning modeling and validation study in the IMI DIRECT cohorts.
PLoS Med. 17:e1003149 (2020)
Background
Non-alcoholic fatty liver disease (NAFLD) is highly prevalent and causes serious health complications in individuals with and without type 2 diabetes (T2D). Early diagnosis of NAFLD is important, as this can help prevent irreversible damage to the liver and, ultimately, hepatocellular carcinomas. We sought to expand etiological understanding and develop a diagnostic tool for NAFLD using machine learning.
Methods and findings
We utilized the baseline data from IMI DIRECT, a multicenter prospective cohort study of 3,029 European-ancestry adults recently diagnosed with T2D (n = 795) or at high risk of developing the disease (n = 2,234). Multi-omics (genetic, transcriptomic, proteomic, and metabolomic) and clinical (liver enzymes and other serological biomarkers, anthropometry, measures of beta-cell function, insulin sensitivity, and lifestyle) data comprised the key input variables. The models were trained on MRI-image-derived liver fat content (<5% or ≥5%) available for 1,514 participants. We applied LASSO (least absolute shrinkage and selection operator) to select features from the different layers of omics data and random forest analysis to develop the models. The prediction models included clinical and omics variables separately or in combination. A model including all omics and clinical variables yielded a cross-validated receiver operating characteristic area under the curve (ROCAUC) of 0.84 (95% CI 0.82, 0.86; p < 0.001), which compared with a ROCAUC of 0.82 (95% CI 0.81, 0.83; p < 0.001) for a model including 9 clinically accessible variables. The IMI DIRECT prediction models outperformed existing noninvasive NAFLD prediction tools. One limitation is that these analyses were performed in adults of European ancestry residing in northern Europe, and it is unknown how well these findings will translate to people of other ancestries and exposed to environmental risk factors that differ from those of the present cohort. Another key limitation of this study is that the prediction was done on a binary outcome of liver fat quantity (<5% or ≥5%) rather than a continuous one.
Conclusions
In this study, we developed several models with different combinations of clinical and omics data and identified biological features that appear to be associated with liver fat accumulation. In general, the clinical variables showed better prediction ability than the complex omics variables. However, the combination of omics and clinical variables yielded the highest accuracy. We have incorporated the developed clinical models into a web interface (see:) and made it available to the community.
Publication type
Article: Journal article
Document type
Scientific article
Keywords
Alcoholic Steatohepatitis; Insulin Sensitivity; Global Epidemiology; NAFLD; Biomarkers
Language
English
Publication year
2020
HGF reporting year
2020
ISSN (print) / ISBN
1549-1277
e-ISSN
1549-1676
Source details
Volume: 17
Issue: 6
Article number: e1003149
Publisher
Public Library of Science (PLoS)
Place of publication
1160 Battery Street, Ste 100, San Francisco, CA 94111, USA
Review status
Peer reviewed
Institute(s)
Molekulare Endokrinologie und Metabolismus (MEM)
Institute of Epidemiology (EPI)
POF Topic(s)
30201 - Metabolic Health
90000 - German Center for Diabetes Research
30202 - Environmental Health
Research field(s)
Genetics and Epidemiology
PSP element(s)
G-505600-001
G-501900-405
G-504091-002
G-505600-003
Date of record entry
2020-06-23